Independent Submission                                    R. Pantos, Ed.
Request for Comments: 8216                                   Apple, Inc.
Category: Informational                                           W. May
ISSN: 2070-1721                                       MLB Advanced Media
                                                             August 2017


                          HTTP Live Streaming

Abstract

   This document describes a protocol for transferring unbounded streams
   of multimedia data.  It specifies the data format of the files and
   the actions to be taken by the server (sender) and the clients
   (receivers) of the streams.  It describes version 7 of this protocol.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This is a contribution to the RFC Series, independently of any other
   RFC stream.  The RFC Editor has chosen to publish this document at
   its discretion and makes no statement about its value for
   implementation or deployment.  Documents approved for publication by
   the RFC Editor are not a candidate for any level of Internet
   Standard; see Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc8216.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

   This document may not be modified, and derivative works of it may not
   be created, except to format it for publication as an RFC or to
   translate it into languages other than English.

Table of Contents

   1. Introduction to HTTP Live Streaming
   2. Overview
   3. Media Segments
      3.1. Supported Media Segment Formats
      3.2. MPEG-2 Transport Streams
      3.3. Fragmented MPEG-4
      3.4. Packed Audio
      3.5. WebVTT
   4. Playlists
      4.1. Definition of a Playlist
      4.2. Attribute Lists
      4.3. Playlist Tags
           4.3.1. Basic Tags
                  4.3.1.1. EXTM3U
                  4.3.1.2. EXT-X-VERSION
           4.3.2. Media Segment Tags
                  4.3.2.1. EXTINF
                  4.3.2.2. EXT-X-BYTERANGE
                  4.3.2.3. EXT-X-DISCONTINUITY
                  4.3.2.4. EXT-X-KEY
                  4.3.2.5. EXT-X-MAP
                  4.3.2.6. EXT-X-PROGRAM-DATE-TIME
                  4.3.2.7. EXT-X-DATERANGE
                           4.3.2.7.1. Mapping SCTE-35 into EXT-X-DATERANGE
           4.3.3. Media Playlist Tags
                  4.3.3.1. EXT-X-TARGETDURATION
                  4.3.3.2. EXT-X-MEDIA-SEQUENCE
                  4.3.3.3. EXT-X-DISCONTINUITY-SEQUENCE
                  4.3.3.4. EXT-X-ENDLIST
                  4.3.3.5. EXT-X-PLAYLIST-TYPE
                  4.3.3.6. EXT-X-I-FRAMES-ONLY
           4.3.4. Master Playlist Tags
                  4.3.4.1. EXT-X-MEDIA
                           4.3.4.1.1. Rendition Groups
                  4.3.4.2. EXT-X-STREAM-INF
                           4.3.4.2.1. Alternative Renditions
                  4.3.4.3. EXT-X-I-FRAME-STREAM-INF
                  4.3.4.4. EXT-X-SESSION-DATA
                  4.3.4.5. EXT-X-SESSION-KEY
           4.3.5. Media or Master Playlist Tags
                  4.3.5.1. EXT-X-INDEPENDENT-SEGMENTS
                  4.3.5.2. EXT-X-START
   5. Key Files
      5.1. Structure of Key Files
      5.2. IV for AES-128
   6. Client/Server Responsibilities
      6.1. Introduction
      6.2. Server Responsibilities
           6.2.1. General Server Responsibilities
           6.2.2. Live Playlists
           6.2.3. Encrypting Media Segments
           6.2.4. Providing Variant Streams
      6.3. Client Responsibilities
           6.3.1. General Client Responsibilities
           6.3.2. Loading the Media Playlist File
           6.3.3. Playing the Media Playlist File
           6.3.4. Reloading the Media Playlist File
           6.3.5. Determining the Next Segment to Load
           6.3.6. Decrypting Encrypted Media Segments
   7. Protocol Version Compatibility
   8. Playlist Examples
      8.1. Simple Media Playlist
      8.2. Live Media Playlist Using HTTPS
      8.3. Playlist with Encrypted Media Segments
      8.4. Master Playlist
      8.5. Master Playlist with I-Frames
      8.6. Master Playlist with Alternative Audio
      8.7. Master Playlist with Alternative Video
      8.8. Session Data in a Master Playlist
      8.9. CHARACTERISTICS Attribute Containing Multiple Characteristics
      8.10. EXT-X-DATERANGE Carrying SCTE-35 Tags
   9. IANA Considerations
   10. Security Considerations
   11. References
      11.1. Normative References
      11.2. Informative References
   Contributors
   Authors' Addresses

1.  Introduction to HTTP Live Streaming

   HTTP Live Streaming provides a reliable, cost-effective means of
   delivering continuous and long-form video over the Internet.  It
   allows a receiver to adapt the bit rate of the media to the current
   network conditions in order to maintain uninterrupted playback at the
   best possible quality.  It supports interstitial content boundaries.
   It provides a flexible framework for media encryption.  It can
   efficiently offer multiple renditions of the same content, such as
   audio translations.  It offers compatibility with large-scale HTTP
   caching infrastructure to support delivery to large audiences.

   Since the Internet-Draft was first posted in 2009, HTTP Live
   Streaming has been implemented and deployed by a wide array of
   content producers, tools vendors, distributors, and device
   manufacturers.  In the subsequent eight years, the protocol has been
   refined by extensive review and discussion with a variety of media
   streaming implementors.

   The purpose of this document is to facilitate interoperability
   between HTTP Live Streaming implementations by describing the media
   transmission protocol.  Using this protocol, a client can receive a
   continuous stream of media from a server for concurrent presentation.

   This document describes version 7 of the protocol.

2.  Overview

   A multimedia presentation is specified by a Uniform Resource
   Identifier (URI) [RFC3986] to a Playlist.

   A Playlist is either a Media Playlist or a Master Playlist.  Both are
   UTF-8 text files containing URIs and descriptive tags.

   A Media Playlist contains a list of Media Segments, which, when
   played sequentially, will play the multimedia presentation.

   Here is an example of a Media Playlist:

   #EXTM3U
   #EXT-X-TARGETDURATION:10

   #EXTINF:9.009,
   http://media.example.com/first.ts
   #EXTINF:9.009,
   http://media.example.com/second.ts
   #EXTINF:3.003,
   http://media.example.com/third.ts

   The first line is the format identifier tag #EXTM3U.  The line
   containing #EXT-X-TARGETDURATION says that all Media Segments will be
   10 seconds long or less.  Then, three Media Segments are declared.
   The first and second are 9.009 seconds long; the third is 3.003
   seconds.
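
   The reading of the example above can be sketched in code.  The
   following Python is a minimal, non-normative illustration that
   handles only the three tags used in the example; the names
   Segment, parse_media_playlist, and EXAMPLE are assumptions made
   for this sketch and are not defined by the protocol.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    duration: float  # seconds, taken from the EXTINF tag
    uri: str


def parse_media_playlist(text: str) -> Tuple[Optional[int], List[Segment]]:
    """Return (target_duration, segments) for a simple Media Playlist.

    Only #EXTM3U, #EXT-X-TARGETDURATION and #EXTINF are interpreted;
    every other tag or comment line is ignored in this sketch.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("a Playlist must start with #EXTM3U")

    target_duration: Optional[int] = None
    pending_duration: Optional[float] = None
    segments: List[Segment] = []

    for line in lines[1:]:
        if line.startswith("#EXT-X-TARGETDURATION:"):
            target_duration = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            # EXTINF carries "<duration>,[<title>]"; keep the duration.
            value = line.split(":", 1)[1]
            pending_duration = float(value.split(",", 1)[0])
        elif not line.startswith("#"):
            # A non-tag line is a URI; it uses the preceding EXTINF.
            if pending_duration is None:
                raise ValueError("segment URI without an EXTINF tag")
            segments.append(Segment(pending_duration, line))
            pending_duration = None
    return target_duration, segments


EXAMPLE = """
   #EXTM3U
   #EXT-X-TARGETDURATION:10

   #EXTINF:9.009,
   http://media.example.com/first.ts
   #EXTINF:9.009,
   http://media.example.com/second.ts
   #EXTINF:3.003,
   http://media.example.com/third.ts
"""

if __name__ == "__main__":
    target, segs = parse_media_playlist(EXAMPLE)
    for seg in segs:
        print(f"{seg.duration:7.3f} s  {seg.uri}")
    print("target duration:", target)
```

   Applied to the example Playlist, the sketch yields three segments
   whose durations are all at or below the declared target duration.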

   To play this Playlist, the client first downloads it and then
   downloads and plays each Media Segment declared within it.  The
   client reloads the Playlist as described in this document to discover
   any added segments.  Data SHOULD be carried over HTTP [RFC7230], but,
   in general, a URI can specify any protocol that can reliably transfer
   the specified resource on demand.

   A more complex presentation can be described by a Master Playlist.  A
   Master Playlist provides a set of Variant Streams, each of which
   describes a different version of the same content.

   A Variant Stream includes a Media Playlist that specifies media
   encoded at a particular bit rate, in a particular format, and at a
   particular resolution for media containing video.

   A Variant Stream can also specify a set of Renditions.  Renditions
   are alternate versions of the content, such as audio produced in
   different languages or video recorded from different camera angles.

   Clients should switch between different Variant Streams to adapt to
   network conditions.  Clients should choose Renditions based on user
   preferences.
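
   This document does not prescribe an adaptation algorithm.  As one
   possible illustration, the following Python sketch picks the
   highest-bit-rate Variant Stream that fits within a fraction of the
   measured throughput.  The Variant class, its bandwidth field, the
   choose_variant function, and the 0.8 safety factor are all
   assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    bandwidth: int  # bits per second (hypothetical field for this sketch)
    uri: str        # URI of the Variant Stream's Media Playlist


def choose_variant(variants: List[Variant], measured_bps: float,
                   safety: float = 0.8) -> Variant:
    """Pick the best variant that fits the measured throughput.

    The safety factor leaves headroom for throughput fluctuations; its
    value is an arbitrary choice for this sketch.
    """
    if not variants:
        raise ValueError("no Variant Streams available")
    budget = measured_bps * safety
    affordable = [v for v in variants if v.bandwidth <= budget]
    if affordable:
        return max(affordable, key=lambda v: v.bandwidth)
    # Nothing fits: fall back to the lowest bit rate on offer.
    return min(variants, key=lambda v: v.bandwidth)


if __name__ == "__main__":
    ladder = [
        Variant(1_280_000, "http://example.com/low.m3u8"),
        Variant(2_560_000, "http://example.com/mid.m3u8"),
        Variant(7_680_000, "http://example.com/hi.m3u8"),
    ]
    print(choose_variant(ladder, measured_bps=4_000_000).uri)
```

   A real client would typically also consider buffer occupancy and
   display resolution before switching.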

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Media Segments

   A Media Playlist contains a series of Media Segments that make up the
   overall presentation.  A Media Segment is specified by a URI and
   optionally a byte range.

   The duration of each Media Segment is indicated in the Media Playlist
   by its EXTINF tag (Section 4.3.2.1).

   Each segment in a Media Playlist has a unique integer Media Sequence
   Number.  The Media Sequence Number of the first segment in the Media
   Playlist is either 0 or declared in the Playlist (Section 4.3.3.2).
   The Media Sequence Number of every other segment is equal to the
   Media Sequence Number of the segment that precedes it plus one.
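
   The numbering rule can be sketched as follows.  This Python
   fragment is a non-normative illustration; the function name
   assign_sequence_numbers is an assumption of this sketch, and the
   fragment treats every non-empty line that does not begin with "#"
   as a segment URI.

```python
from typing import List, Tuple


def assign_sequence_numbers(playlist_text: str) -> List[Tuple[int, str]]:
    """Pair each segment URI with its Media Sequence Number.

    The first segment is numbered 0 unless the Playlist declares a
    starting value with EXT-X-MEDIA-SEQUENCE; each later segment is
    numbered one higher than the segment before it.
    """
    first = 0
    uris: List[str] = []
    for raw in playlist_text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            first = int(line.split(":", 1)[1])
        elif line and not line.startswith("#"):
            uris.append(line)
    return [(first + i, uri) for i, uri in enumerate(uris)]


if __name__ == "__main__":
    live = "\n".join([
        "#EXTM3U",
        "#EXT-X-TARGETDURATION:8",
        "#EXT-X-MEDIA-SEQUENCE:2680",
        "#EXTINF:7.975,",
        "fileSequence2680.ts",
        "#EXTINF:7.941,",
        "fileSequence2681.ts",
    ])
    for number, uri in assign_sequence_numbers(live):
        print(number, uri)
```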

   Each Media Segment MUST carry the continuation of the encoded
   bitstream from the end of the segment with the previous Media
   Sequence Number, where values in a series such as timestamps and
   Continuity Counters MUST continue uninterrupted.  The only exceptions
   are the first Media Segment ever to appear in a Media Playlist and
   Media Segments that are explicitly signaled as discontinuities
   (Section 4.3.2.3).  Unmarked media discontinuities can trigger
   playback errors.

   Any Media Segment that contains video SHOULD include enough
   information to initialize a video decoder and decode a continuous set
   of frames that includes the final frame in the Segment; network
   efficiency is optimized if there is enough information in the Segment
   to decode all frames in the Segment.  For example, any Media Segment
   containing H.264 video SHOULD contain an Instantaneous Decoding
   Refresh (IDR); frames prior to the first IDR will be downloaded but
   possibly discarded.

3.1.  Supported Media Segment Formats

   All Media Segments MUST be in a format described in this section.
   Transport of other media file formats is not defined.

   Some media formats require a common sequence of bytes to initialize a
   parser before a Media Segment can be parsed.  This format-specific
   sequence is called the Media Initialization Section.  The Media
   Initialization Section can be specified by an EXT-X-MAP tag
   (Section 4.3.2.5).  The Media Initialization Section MUST NOT contain
   sample data.

3.2.  MPEG-2 Transport Streams

   MPEG-2 Transport Streams are specified by [ISO_13818].

   The Media Initialization Section of an MPEG-2 Transport Stream
   Segment is a Program Association Table (PAT) followed by a Program
   Map Table (PMT).

   Transport Stream Segments MUST contain a single MPEG-2 Program;
   playback of Multi-Program Transport Streams is not defined.  Each
   Transport Stream Segment MUST contain a PAT and a PMT, or have an
   EXT-X-MAP tag (Section 4.3.2.5) applied to it.  The first two
   Transport Stream packets in a Segment without an EXT-X-MAP tag SHOULD
   be a PAT and a PMT.
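
   A server-side check of the last recommendation might look like the
   following Python sketch.  It relies on the Transport Stream packet
   layout from [ISO_13818] (188-octet packets, sync byte 0x47, PAT
   carried on PID 0) and assumes that the PAT section fits in a single
   packet.  It does not verify the section CRC.  All function names
   are assumptions of this sketch.

```python
from typing import List

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PAT_PID = 0x0000


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID from a Transport Stream packet header."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def packet_payload(packet: bytes) -> bytes:
    """Return the packet payload, skipping any adaptation field."""
    adaptation_field_control = (packet[3] >> 4) & 0x3
    offset = 4
    if adaptation_field_control & 0x2:      # adaptation field present
        offset += 1 + packet[4]
    if not adaptation_field_control & 0x1:  # packet carries no payload
        return b""
    return packet[offset:]


def pmt_pids_from_pat(packet: bytes) -> List[int]:
    """List the PMT PIDs announced by a PAT held in a single packet."""
    payload = packet_payload(packet)
    if not (packet[1] & 0x40) or not payload:
        raise ValueError("no section starts in this packet")
    section = payload[1 + payload[0]:]      # skip the pointer_field
    if section[0] != 0x00:
        raise ValueError("table_id is not that of a PAT")
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    end = 3 + section_length - 4            # stop before the CRC_32
    pids = []
    for i in range(8, end, 4):
        program_number = (section[i] << 8) | section[i + 1]
        pid = ((section[i + 2] & 0x1F) << 8) | section[i + 3]
        if program_number != 0:             # 0 denotes the network PID
            pids.append(pid)
    return pids


def starts_with_pat_and_pmt(segment: bytes) -> bool:
    """True if the first two packets are a PAT and then a PMT."""
    if len(segment) < 2 * TS_PACKET_SIZE:
        return False
    first = segment[:TS_PACKET_SIZE]
    second = segment[TS_PACKET_SIZE:2 * TS_PACKET_SIZE]
    if first[0] != SYNC_BYTE or second[0] != SYNC_BYTE:
        return False
    if packet_pid(first) != PAT_PID:
        return False
    try:
        pmt_pids = pmt_pids_from_pat(first)
    except (ValueError, IndexError):
        return False
    return packet_pid(second) in pmt_pids
```

   Because a Segment must carry a single MPEG-2 Program, a stricter
   check could also require that pmt_pids_from_pat() return exactly
   one PID.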

3.3.  Fragmented MPEG-4

   MPEG-4 Fragments are specified by the ISO Base Media File Format
   [ISOBMFF].  Unlike regular MPEG-4 files that have a Movie Box
   ('moov') that contains sample tables and a Media Data Box ('mdat')
   containing the corresponding samples, an MPEG-4 Fragment consists of
   a Movie Fragment Box ('moof') containing a subset of the sample table
   and a Media Data Box containing those samples.  Use of MPEG-4
   Fragments does require a Movie Box for initialization, but that Movie
   Box contains only non-sample-specific information such as track and
   sample descriptions.

   A Fragmented MPEG-4 (fMP4) Segment is a "segment" as defined by
   Section 3 of [ISOBMFF], including the constraints on Media Data Boxes
   in Section 8.16 of [ISOBMFF].

   The Media Initialization Section for an fMP4 Segment is an ISO Base
   Media File that can initialize a parser for that Segment.

   Broadly speaking, fMP4 Segments and Media Initialization Sections are
   [ISOBMFF] files that also satisfy the constraints described in this
   section.

   The Media Initialization Section for an fMP4 Segment MUST contain a
   File Type Box ('ftyp') containing a brand that is compatible with
   'iso6' or higher.  The File Type Box MUST be followed by a Movie Box.
   The Movie Box MUST contain a Track Box ('trak') for every Track
   Fragment Box ('traf') in the fMP4 Segment, with matching track_ID.
   Each Track Box SHOULD contain a sample table, but its sample count
   MUST be zero.  Movie Header Boxes ('mvhd') and Track Header Boxes
   ('tkhd') MUST have durations of zero.  A Movie Extends Box ('mvex')
   MUST follow the last Track Box.  Note that a Common Media Application
   Format (CMAF) Header [CMAF] meets all these requirements.
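
   Some of the structural requirements above can be checked by walking
   the box hierarchy.  The following Python sketch is a partial,
   non-normative illustration: it reads "followed by" as "immediately
   followed by", and it checks only box ordering, not brands, track
   IDs, sample counts, or durations.  The function names are
   assumptions of this sketch.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (type, body) for each box found at the top level of data."""
    offset, end = 0, len(data)
    while offset + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the box type.
            if offset + 16 > end:
                raise ValueError("truncated largesize field")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing data.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("box size is out of range")
        yield kind.decode("latin-1"), data[offset + header:offset + size]
        offset += size


def looks_like_init_section(data: bytes) -> bool:
    """Check box ordering of a Media Initialization Section.

    Verifies that 'ftyp' comes first and is followed by 'moov', and
    that 'mvex' appears in 'moov' after the last 'trak'.
    """
    boxes = list(iter_boxes(data))
    if [kind for kind, _ in boxes[:2]] != ["ftyp", "moov"]:
        return False
    children = [kind for kind, _ in iter_boxes(boxes[1][1])]
    if "trak" not in children or "mvex" not in children:
        return False
    last_trak = len(children) - 1 - children[::-1].index("trak")
    return children.index("mvex") > last_trak
```

   The same iter_boxes() helper could be applied to an fMP4 Segment to
   locate its 'moof' and 'mdat' boxes.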

   In an fMP4 Segment, every Track Fragment Box MUST contain a Track
   Fragment Decode Time Box ('tfdt').  fMP4 Segments MUST use movie-
   fragment-relative addressing.  fMP4 Segments MUST NOT use external
   data references.  Note that a CMAF Segment meets these requirements.

   An fMP4 Segment in a Playlist containing the EXT-X-I-FRAMES-ONLY tag
   (Section 4.3.3.6) MAY omit the portion of the Media Data Box
   following the intra-coded frame (I-frame) sample data.

   Each fMP4 Segment in a Media Playlist MUST have an EXT-X-MAP tag
   applied to it.

3.4.  Packed Audio

   A Packed Audio Segment contains encoded audio samples and ID3 tags
   that are simply packed together with minimal framing and no per-
   sample timestamps.  Supported Packed Audio formats are Advanced Audio
   Coding (AAC) with Audio Data Transport Stream (ADTS) framing
   [ISO_13818_7], MP3 [ISO_13818_3], AC-3 [AC_3], and Enhanced AC-3
   [AC_3].

   A Packed Audio Segment has no Media Initialization Section.

   Each Packed Audio Segment MUST signal the timestamp of its first
   sample with an ID3 Private frame (PRIV) tag [ID3] at the beginning of
   the segment.  The ID3 PRIV owner identifier MUST be
   "com.apple.streaming.transportStreamTimestamp".  The ID3 payload MUST
   be a 33-bit MPEG-2 Program Elementary Stream timestamp expressed as a
   big-endian eight-octet number, with the upper 31 bits set to zero.
   Clients SHOULD NOT play Packed Audio Segments without this ID3 tag.
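
   Decoding the timestamp can be sketched as follows.  This Python
   fragment assumes that the caller has already extracted the body of
   the PRIV frame from the ID3 tag; per [ID3], that body is the owner
   identifier, a terminating zero octet, and then the private data.
   The conversion to seconds assumes the 90 kHz clock used for MPEG-2
   PES timestamps.  The function names are assumptions of this
   sketch.

```python
OWNER_ID = b"com.apple.streaming.transportStreamTimestamp"
MPEG2_CLOCK_HZ = 90_000  # PES timestamps count 90 kHz ticks


def decode_priv_timestamp(priv_body: bytes) -> int:
    """Return the 33-bit timestamp carried in a PRIV frame body."""
    owner, terminator, payload = priv_body.partition(b"\x00")
    if not terminator or owner != OWNER_ID:
        raise ValueError("unexpected PRIV owner identifier")
    if len(payload) != 8:
        raise ValueError("payload must be exactly eight octets")
    value = int.from_bytes(payload, "big")
    if value >> 33:
        raise ValueError("upper 31 bits must be zero")
    return value


def timestamp_seconds(timestamp: int) -> float:
    """Convert a 90 kHz timestamp to seconds."""
    return timestamp / MPEG2_CLOCK_HZ


if __name__ == "__main__":
    body = OWNER_ID + b"\x00" + (900_000).to_bytes(8, "big")
    ticks = decode_priv_timestamp(body)
    print(ticks, timestamp_seconds(ticks))
```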

3.5.  WebVTT

   A WebVTT Segment is a section of a WebVTT [WebVTT] file.  WebVTT
   Segments carry subtitles.

   The Media Initialization Section of a WebVTT Segment is the WebVTT
   header.

   Each WebVTT Segment MUST contain all subtitle cues that are intended
   to be displayed during the period indicated by the segment EXTINF
   duration.  The start time offset and end time offset of each cue MUST
   indicate the total display time for that cue, even if part of the cue
   time range is outside the Segment period.  A WebVTT Segment MAY
   contain no cues; this indicates that no subtitles are to be displayed
   during that period.
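
   A packager could select the cues for each Segment as in the
   following Python sketch.  The Cue class and the cues_for_segment
   function are assumptions of this sketch, as is the choice to treat
   the Segment period as the half-open interval [start, start +
   duration).  Cues are copied unchanged, so a cue that spans a
   Segment boundary appears in both Segments with its full time range.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def cues_for_segment(cues: List[Cue], segment_start: float,
                     segment_duration: float) -> List[Cue]:
    """Return every cue whose display time overlaps the Segment period.

    Start and end times are not clipped to the Segment, so each cue
    keeps its total display time.
    """
    segment_end = segment_start + segment_duration
    return [cue for cue in cues
            if cue.start < segment_end and cue.end > segment_start]


if __name__ == "__main__":
    all_cues = [
        Cue(0.0, 4.0, "Hello."),
        Cue(8.0, 12.0, "This cue spans two segments."),
        Cue(20.0, 22.0, "Goodbye."),
    ]
    for start in (0.0, 10.0, 20.0):
        texts = [c.text for c in cues_for_segment(all_cues, start, 10.0)]
        print(start, texts)
```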

   Each WebVTT Segment MUST either start with a WebVTT header or have an
   EXT-X-MAP tag applied to it.

   In order to synchronize timestamps between audio/video and subtitles,
   an X-TIMESTAMP-MAP metadata header SHOULD be added to each WebVTT
   header.  This header maps WebVTT cue timestamps to MPEG-2 (PES)
   timestamps in other Renditions of the Variant Stream.  Its format is:

   X-TIMESTAMP-MAP=LOCAL:<cue time>,MPEGTS:<MPEG-2 time>
   e.g., X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000

   The cue timestamp in the LOCAL attribute MAY fall outside the range
   of time covered by the segment.

   If a WebVTT segment does not have the X-TIMESTAMP-MAP, the client
   MUST assume that the WebVTT cue time of 0 maps to an MPEG-2 timestamp
   of 0.

   When synchronizing WebVTT with PES timestamps, clients SHOULD account
   for cases where the 33-bit PES timestamps have wrapped and the WebVTT
   cue times have not.
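
   The mapping can be sketched in Python as follows.  The sketch
   assumes a 90 kHz MPEG-2 clock, WebVTT timestamps of the form
   [hh:]mm:ss.ttt with three fractional digits, and a header line that
   contains no spaces.  It models PES wraparound by reducing the
   result modulo 2^33.  The function names are assumptions of this
   sketch.

```python
from typing import Optional, Tuple

TICKS_PER_MS = 90      # 90 kHz clock: 90 ticks per millisecond
PTS_MODULUS = 1 << 33  # PES timestamps are 33 bits wide


def cue_time_to_ms(text: str) -> int:
    """Convert a WebVTT timestamp ([hh:]mm:ss.ttt) to milliseconds."""
    clock, _, millis = text.partition(".")
    fields = [int(field) for field in clock.split(":")]
    while len(fields) < 3:
        fields.insert(0, 0)  # the hours field is optional
    hours, minutes, seconds = fields
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis or 0)


def parse_timestamp_map(line: Optional[str]) -> Tuple[int, int]:
    """Return (local_ms, mpegts) from an X-TIMESTAMP-MAP header line.

    When the header is absent (line is None), cue time 0 maps to
    MPEG-2 timestamp 0.
    """
    if line is None:
        return 0, 0
    name, _, value = line.strip().partition("=")
    if name != "X-TIMESTAMP-MAP":
        raise ValueError("not an X-TIMESTAMP-MAP header")
    # Each attribute is "NAME:value"; split on the first colon only,
    # because the LOCAL value itself contains colons.
    attrs = dict(item.split(":", 1) for item in value.split(","))
    return cue_time_to_ms(attrs["LOCAL"]), int(attrs["MPEGTS"])


def cue_to_mpegts(cue_ms: int, local_ms: int, mpegts: int) -> int:
    """Map a cue time to the 33-bit MPEG-2 timeline."""
    return (mpegts + (cue_ms - local_ms) * TICKS_PER_MS) % PTS_MODULUS


if __name__ == "__main__":
    local, base = parse_timestamp_map(
        "X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000")
    print(cue_to_mpegts(cue_time_to_ms("00:00:10.000"), local, base))
```

   With the example header above, a cue at 00:00:10.000 maps to an
   MPEG-2 timestamp of 1800000, that is, 10 seconds after the mapped
   origin of 900000.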


