RFC 2083

PNG (Portable Network Graphics) Specification Version 1.0

Pages: 102
Informational

Part 2 of 4 – Pages 15 to 40

RFC2083 - Page 15

4. Chunk Specifications

   This chapter defines the standard types of PNG chunks.

   4.1. Critical chunks

      All implementations must understand and successfully render the
      standard critical chunks.  A valid PNG image must contain an IHDR
      chunk, one or more IDAT chunks, and an IEND chunk.

      4.1.1. IHDR Image header

         The IHDR chunk must appear FIRST.  It contains:

            Width:              4 bytes
            Height:             4 bytes
            Bit depth:          1 byte
            Color type:         1 byte
            Compression method: 1 byte
            Filter method:      1 byte
            Interlace method:   1 byte

         Width and height give the image dimensions in pixels.  They are
         4-byte integers. Zero is an invalid value. The maximum for each
         is (2^31)-1 in order to accommodate languages that have
         difficulty with unsigned 4-byte values.

         Bit depth is a single-byte integer giving the number of bits
         per sample or per palette index (not per pixel).  Valid values
         are 1, 2, 4, 8, and 16, although not all values are allowed for
         all color types.

         Color type is a single-byte integer that describes the
         interpretation of the image data.  Color type codes represent
         sums of the following values: 1 (palette used), 2 (color used),
         and 4 (alpha channel used). Valid values are 0, 2, 3, 4, and 6.

         Bit depth restrictions for each color type are imposed to
         simplify implementations and to prohibit combinations that do
         not compress well.  Decoders must support all legal
         combinations of bit depth and color type.  The allowed
         combinations are:

            Color    Allowed    Interpretation
            Type    Bit Depths

            0       1,2,4,8,16  Each pixel is a grayscale sample.

            2       8,16        Each pixel is an R,G,B triple.

            3       1,2,4,8     Each pixel is a palette index;
                                a PLTE chunk must appear.

            4       8,16        Each pixel is a grayscale sample,
                                followed by an alpha sample.

            6       8,16        Each pixel is an R,G,B triple,
                                followed by an alpha sample.

         The sample depth is the same as the bit depth except in the
         case of color type 3, in which the sample depth is always 8
         bits.

         Compression method is a single-byte integer that indicates the
         method used to compress the image data.  At present, only
         compression method 0 (deflate/inflate compression with a 32K
         sliding window) is defined.  All standard PNG images must be
         compressed with this scheme.  The compression method field is
         provided for possible future expansion or proprietary variants.
         Decoders must check this byte and report an error if it holds
         an unrecognized code.  See Deflate/Inflate Compression (Chapter
         5) for details.

         Filter method is a single-byte integer that indicates the
         preprocessing method applied to the image data before
         compression.  At present, only filter method 0 (adaptive
         filtering with five basic filter types) is defined.  As with
         the compression method field, decoders must check this byte and
         report an error if it holds an unrecognized code.  See Filter
         Algorithms (Chapter 6) for details.

         Interlace method is a single-byte integer that indicates the
         transmission order of the image data.  Two values are currently
         defined: 0 (no interlace) or 1 (Adam7 interlace).  See
         Interlaced data order (Section 2.6) for details.
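
         The IHDR layout and the checks described above can be sketched
         as follows.  This is a minimal illustration, not a reference
         decoder: the function name parse_ihdr and the returned
         dictionary keys are hypothetical, and the big-endian field
         order assumes the network byte order that PNG uses for
         multi-byte integers.

```python
import struct

# Allowed bit depths for each color type, copied from the table above.
ALLOWED_BIT_DEPTHS = {
    0: (1, 2, 4, 8, 16),
    2: (8, 16),
    3: (1, 2, 4, 8),
    4: (8, 16),
    6: (8, 16),
}
MAX_DIMENSION = 2**31 - 1


def parse_ihdr(data):
    """Decode and validate the 13 data bytes of an IHDR chunk (hypothetical helper)."""
    if len(data) != 13:
        raise ValueError("IHDR data must be exactly 13 bytes")
    # Two 4-byte integers, then five single-byte fields, all big-endian.
    (width, height, bit_depth, color_type,
     compression, filter_method, interlace) = struct.unpack(">IIBBBBB", data)
    if not (0 < width <= MAX_DIMENSION and 0 < height <= MAX_DIMENSION):
        raise ValueError("width and height must be in 1 .. 2^31-1")
    if color_type not in ALLOWED_BIT_DEPTHS:
        raise ValueError("invalid color type")
    if bit_depth not in ALLOWED_BIT_DEPTHS[color_type]:
        raise ValueError("bit depth not allowed for this color type")
    if compression != 0:
        raise ValueError("unrecognized compression method")
    if filter_method != 0:
        raise ValueError("unrecognized filter method")
    if interlace not in (0, 1):
        raise ValueError("unrecognized interlace method")
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,
        "interlace_method": interlace,
    }
```

         Rejecting unknown compression and filter codes up front mirrors
         the requirement that decoders report an error for them.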

      4.1.2. PLTE Palette

         The PLTE chunk contains from 1 to 256 palette entries, each a
         three-byte series of the form:

            Red:   1 byte (0 = black, 255 = red)
            Green: 1 byte (0 = black, 255 = green)
            Blue:  1 byte (0 = black, 255 = blue)

         The number of entries is determined from the chunk length.  A
         chunk length not divisible by 3 is an error.

         This chunk must appear for color type 3, and can appear for
         color types 2 and 6; it must not appear for color types 0 and
         4. If this chunk does appear, it must precede the first IDAT
         chunk.  There must not be more than one PLTE chunk.

         For color type 3 (indexed color), the PLTE chunk is required.
         The first entry in PLTE is referenced by pixel value 0, the
         second by pixel value 1, etc.  The number of palette entries
         must not exceed the range that can be represented in the image
         bit depth (for example, 2^4 = 16 for a bit depth of 4).  It is
         permissible to have fewer entries than the bit depth would
         allow.  In that case, any out-of-range pixel value found in the
         image data is an error.

         For color types 2 and 6 (truecolor and truecolor with alpha),
         the PLTE chunk is optional.  If present, it provides a
         suggested set of from 1 to 256 colors to which the truecolor
         image can be quantized if the viewer cannot display truecolor
         directly.  If PLTE is not present, such a viewer will need to
         select colors on its own, but it is often preferable for this
         to be done once by the encoder.  (See Recommendations for
         Encoders: Suggested palettes, Section 9.5.)

         Note that the palette uses 8 bits (1 byte) per sample
         regardless of the image bit depth specification.  In
         particular, the palette is 8 bits deep even when it is a
         suggested quantization of a 16-bit truecolor image.

         There is no requirement that the palette entries all be used by
         the image, nor that they all be different.
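
         A minimal sketch of reading PLTE data under the rules above.
         The name parse_plte and the optional bit_depth argument (pass
         it only for color type 3) are assumptions made for this
         example.

```python
def parse_plte(data, bit_depth=None):
    """Split PLTE chunk data into (red, green, blue) tuples (hypothetical helper)."""
    if len(data) % 3 != 0:
        raise ValueError("PLTE length is not divisible by 3")
    count = len(data) // 3
    if not 1 <= count <= 256:
        raise ValueError("PLTE must hold 1 to 256 entries")
    # For indexed-color images the palette must fit the index range.
    if bit_depth is not None and count > (1 << bit_depth):
        raise ValueError("too many palette entries for this bit depth")
    # Each entry is three 8-bit samples, regardless of image bit depth.
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
```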

      4.1.3. IDAT Image data

         The IDAT chunk contains the actual image data.  To create this
         data:

             * Begin with image scanlines represented as described in
               Image layout (Section 2.3); the layout and total size of
               this raw data are determined by the fields of IHDR.
             * Filter the image data according to the filtering method
               specified by the IHDR chunk.  (Note that with filter
               method 0, the only one currently defined, this implies
               prepending a filter type byte to each scanline.)
             * Compress the filtered data using the compression method
               specified by the IHDR chunk.

         The IDAT chunk contains the output datastream of the
         compression algorithm.

         To read the image data, reverse this process.

         There can be multiple IDAT chunks; if so, they must appear
         consecutively with no other intervening chunks.  The compressed
         datastream is then the concatenation of the contents of all the
         IDAT chunks.  The encoder can divide the compressed datastream
         into IDAT chunks however it wishes.  (Multiple IDAT chunks are
         allowed so that encoders can work in a fixed amount of memory;
         typically the chunk size will correspond to the encoder's
         buffer size.) It is important to emphasize that IDAT chunk
         boundaries have no semantic significance and can occur at any
         point in the compressed datastream.  A PNG file in which each
         IDAT chunk contains only one data byte is legal, though
         remarkably wasteful of space.  (For that matter, zero-length
         IDAT chunks are legal, though even more wasteful.)

         See Filter Algorithms (Chapter 6) and Deflate/Inflate
         Compression (Chapter 5) for details.
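
         The three encoding steps and their reversal can be sketched as
         below.  This is an illustration under stated assumptions: the
         image is non-interlaced, every scanline uses filter type 0
         (no filtering), and the helper names are hypothetical.

```python
import zlib


def build_idat_stream(scanlines):
    """Filter (type 0) and compress raw scanlines into one datastream."""
    # Filter method 0 prepends one filter-type byte to every scanline;
    # this sketch always writes filter type 0.
    filtered = b"".join(b"\x00" + line for line in scanlines)
    return zlib.compress(filtered)


def read_idat_stream(idat_payloads, bytes_per_scanline):
    """Reverse the process: concatenate, decompress, strip filter bytes."""
    raw = zlib.decompress(b"".join(idat_payloads))
    stride = bytes_per_scanline + 1  # data bytes plus the filter-type byte
    if len(raw) % stride != 0:
        raise ValueError("decompressed size does not match the scanline size")
    lines = []
    for offset in range(0, len(raw), stride):
        if raw[offset] != 0:
            raise NotImplementedError("this sketch handles filter type 0 only")
        lines.append(raw[offset + 1:offset + stride])
    return lines
```

         Because chunk boundaries carry no meaning, read_idat_stream
         returns the same scanlines however the stream was divided,
         including when some payloads are empty.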

      4.1.4. IEND Image trailer

         The IEND chunk must appear LAST.  It marks the end of the PNG
         datastream.  The chunk's data field is empty.

   4.2. Ancillary chunks

      All ancillary chunks are optional, in the sense that encoders need
      not write them and decoders can ignore them.  However, encoders
      are encouraged to write the standard ancillary chunks when the
      information is available, and decoders are encouraged to interpret
      these chunks when appropriate and feasible.

      The standard ancillary chunks are listed in alphabetical order.
      This is not necessarily the order in which they would appear in a
      file.

      4.2.1. bKGD Background color

         The bKGD chunk specifies a default background color to present
         the image against.  Note that viewers are not bound to honor
         this chunk; a viewer can choose to use a different background.

         For color type 3 (indexed color), the bKGD chunk contains:

            Palette index:  1 byte

         The value is the palette index of the color to be used as
         background.

         For color types 0 and 4 (grayscale, with or without alpha),
         bKGD contains:

            Gray:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes are used regardless of the image bit
         depth.)  The value is the gray level to be used as background.

         For color types 2 and 6 (truecolor, with or without alpha),
         bKGD contains:

            Red:   2 bytes, range 0 .. (2^bitdepth)-1
            Green: 2 bytes, range 0 .. (2^bitdepth)-1
            Blue:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes per sample are used regardless of the
         image bit depth.)  This is the RGB color to be used as
         background.

         When present, the bKGD chunk must precede the first IDAT chunk,
         and must follow the PLTE chunk, if any.

         See Recommendations for Decoders: Background color (Section
         10.7).

      4.2.2. cHRM Primary chromaticities and white point

         Applications that need device-independent specification of
         colors in a PNG file can use the cHRM chunk to specify the 1931
         CIE x,y chromaticities of the red, green, and blue primaries
         used in the image, and the referenced white point. See Color
         Tutorial (Chapter 14) for more information.

         The cHRM chunk contains:

            White Point x: 4 bytes
            White Point y: 4 bytes
            Red x:         4 bytes
            Red y:         4 bytes
            Green x:       4 bytes
            Green y:       4 bytes
            Blue x:        4 bytes
            Blue y:        4 bytes

         Each value is encoded as a 4-byte unsigned integer,
         representing the x or y value times 100000.  For example, a
         value of 0.3127 would be stored as the integer 31270.
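
         The fixed-point encoding can be sketched as follows.  The field
         names and helper functions are hypothetical, the chromaticity
         numbers used with them are example inputs only, and big-endian
         packing assumes PNG's network byte order for integers.

```python
import struct

# Field order as listed in the chunk layout above.
CHRM_FIELDS = ("white_x", "white_y", "red_x", "red_y",
               "green_x", "green_y", "blue_x", "blue_y")


def encode_chrm(values):
    """Pack eight chromaticity values, each stored as value * 100000."""
    scaled = [round(values[name] * 100000) for name in CHRM_FIELDS]
    return struct.pack(">8I", *scaled)


def decode_chrm(data):
    """Unpack cHRM chunk data back into floating-point chromaticities."""
    if len(data) != 32:
        raise ValueError("cHRM data must be exactly 32 bytes")
    stored = struct.unpack(">8I", data)
    return {name: value / 100000 for name, value in zip(CHRM_FIELDS, stored)}
```

         Rounding rather than truncating avoids storing 31269 for 0.3127
         when the floating-point product falls just below 31270.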

         cHRM is allowed in all PNG files, although it is of little
         value for grayscale images.

         If the encoder does not know the chromaticity values, it should
         not write a cHRM chunk; the absence of a cHRM chunk indicates
         that the image's primary colors are device-dependent.

         If the cHRM chunk appears, it must precede the first IDAT
         chunk, and it must also precede the PLTE chunk if present.

         See Recommendations for Encoders: Encoder color handling
         (Section 9.3), and Recommendations for Decoders: Decoder color
         handling (Section 10.6).

      4.2.3. gAMA Image gamma

         The gAMA chunk specifies the gamma of the camera (or simulated
         camera) that produced the image, and thus the gamma of the
         image with respect to the original scene.  More precisely, the
         gAMA chunk encodes the file_gamma value, as defined in Gamma
         Tutorial (Chapter 13).

         The gAMA chunk contains:

            Image gamma: 4 bytes

         The value is encoded as a 4-byte unsigned integer, representing
         gamma times 100000.  For example, a gamma of 0.45 would be
         stored as the integer 45000.

         If the encoder does not know the image's gamma value, it should
         not write a gAMA chunk; the absence of a gAMA chunk indicates
         that the gamma is unknown.

         If the gAMA chunk appears, it must precede the first IDAT
         chunk, and it must also precede the PLTE chunk if present.

         See Gamma correction (Section 2.7), Recommendations for
         Encoders: Encoder gamma handling (Section 9.2), and
         Recommendations for Decoders: Decoder gamma handling (Section
         10.5).

      4.2.4. hIST Image histogram

         The hIST chunk gives the approximate usage frequency of each
         color in the color palette.  A histogram chunk can appear only
         when a palette chunk appears.  If a viewer is unable to provide
         all the colors listed in the palette, the histogram may help it
         decide how to choose a subset of the colors for display.

         The hIST chunk contains a series of 2-byte (16 bit) unsigned
         integers.  There must be exactly one entry for each entry in
         the PLTE chunk.  Each entry is proportional to the fraction of
         pixels in the image that have that palette index; the exact
         scale factor is chosen by the encoder.

         Histogram entries are approximate, with the exception that a
         zero entry specifies that the corresponding palette entry is
         not used at all in the image.  It is required that a histogram
         entry be nonzero if there are any pixels of that color.
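
         One way an encoder might choose the scale factor while keeping
         used entries nonzero is sketched below.  Scaling so that the
         most frequent color maps to 65535 is an assumption of this
         example, not a requirement; build_hist is a hypothetical name.

```python
import struct


def build_hist(pixel_counts):
    """Scale per-palette-entry pixel counts into 16-bit hIST entries."""
    peak = max(pixel_counts, default=0)
    entries = []
    for count in pixel_counts:
        if count == 0:
            # Zero is reserved for palette entries that are never used.
            entries.append(0)
        else:
            # Clamp to 1 so that rare colors are not reported as unused.
            entries.append(max(1, count * 65535 // peak))
    return struct.pack(">%dH" % len(entries), *entries)
```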

         When the palette is a suggested quantization of a truecolor
         image, the histogram is necessarily approximate, since a
         decoder may map pixels to palette entries differently than the
         encoder did.  In this situation, zero entries should not
         appear.

         The hIST chunk, if it appears, must follow the PLTE chunk, and
         must precede the first IDAT chunk.

         See Rationale: Palette histograms (Section 12.14), and
         Recommendations for Decoders: Suggested-palette and histogram
         usage (Section 10.10).

      4.2.5. pHYs Physical pixel dimensions

         The pHYs chunk specifies the intended pixel size or aspect
         ratio for display of the image.  It contains:

            Pixels per unit, X axis: 4 bytes (unsigned integer)
            Pixels per unit, Y axis: 4 bytes (unsigned integer)
            Unit specifier:          1 byte

         The following values are legal for the unit specifier:

            0: unit is unknown
            1: unit is the meter

         When the unit specifier is 0, the pHYs chunk defines pixel
         aspect ratio only; the actual size of the pixels remains
         unspecified.

         Conversion note: one inch is equal to exactly 0.0254 meters.
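
         As a worked example of the conversion, the sketch below turns a
         resolution in dots per inch into pHYs data.  The helper names
         are hypothetical, and rounding to the nearest integer is a
         choice made for this example.

```python
import struct

METERS_PER_INCH = 0.0254  # exact, per the conversion note above


def encode_phys_from_dpi(dpi_x, dpi_y):
    """Build pHYs chunk data from a resolution given in pixels per inch."""
    per_meter_x = round(dpi_x / METERS_PER_INCH)
    per_meter_y = round(dpi_y / METERS_PER_INCH)
    # Unit specifier 1 means the unit is the meter.
    return struct.pack(">IIB", per_meter_x, per_meter_y, 1)


def pixels_per_meter_to_dpi(pixels_per_meter):
    """Convert a stored pixels-per-meter value back to pixels per inch."""
    return pixels_per_meter * METERS_PER_INCH
```

         For instance, 72 pixels per inch is about 2834.6 pixels per
         meter and is stored as 2835.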

         If this ancillary chunk is not present, pixels are assumed to
         be square, and the physical size of each pixel is unknown.

         If present, this chunk must precede the first IDAT chunk.

         See Recommendations for Decoders: Pixel dimensions (Section
         10.2).

      4.2.6. sBIT Significant bits

         To simplify decoders, PNG specifies that only certain sample
         depths can be used, and further specifies that sample values
         should be scaled to the full range of possible values at the
         sample depth.  However, the sBIT chunk is provided in order to
         store the original number of significant bits.  This allows
         decoders to recover the original data losslessly even if the
         data had a sample depth not directly supported by PNG.  We
         recommend that an encoder emit an sBIT chunk if it has
         converted the data from a lower sample depth.

         For color type 0 (grayscale), the sBIT chunk contains a single
         byte, indicating the number of bits that were significant in
         the source data.

         For color type 2 (truecolor), the sBIT chunk contains three
         bytes, indicating the number of bits that were significant in
         the source data for the red, green, and blue channels,
         respectively.

         For color type 3 (indexed color), the sBIT chunk contains three
         bytes, indicating the number of bits that were significant in
         the source data for the red, green, and blue components of the
         palette entries, respectively.

         For color type 4 (grayscale with alpha channel), the sBIT chunk
         contains two bytes, indicating the number of bits that were
         significant in the source grayscale data and the source alpha
         data, respectively.

         For color type 6 (truecolor with alpha channel), the sBIT chunk
         contains four bytes, indicating the number of bits that were
         significant in the source data for the red, green, blue and
         alpha channels, respectively.

         Each depth specified in sBIT must be greater than zero and less
         than or equal to the sample depth (which is 8 for indexed-color
         images, and the bit depth given in IHDR for other color types).

         A decoder need not pay attention to sBIT: the stored image is a
         valid PNG file of the sample depth indicated by IHDR.  However,
         if the decoder wishes to recover the original data at its
         original precision, this can be done by right-shifting the
         stored samples (the stored palette entries, for an indexed-
         color image).  The encoder must scale the data in such a way
         that the high-order bits match the original data.
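
         The round trip can be illustrated with a short sketch.  The
         scaling shown here, which fills the low-order bits by repeating
         the sample's own high-order bits, is one approach that keeps
         the high-order bits equal to the original data; it is an
         assumption of this example, and the function names are
         hypothetical.

```python
def scale_up(sample, source_bits, target_bits):
    """Scale a sample to a larger depth, keeping its high-order bits."""
    if not 0 < source_bits <= target_bits:
        raise ValueError("source depth must be in 1 .. target depth")
    result = sample << (target_bits - source_bits)
    filled = source_bits
    # Repeat the leading bits until every low-order bit is filled.
    while filled < target_bits:
        result |= result >> filled
        filled *= 2
    return result


def recover_original(stored, source_bits, target_bits):
    """Recover the original sample by right-shifting, as described above."""
    return stored >> (target_bits - source_bits)
```

         With 5 significant bits stored at sample depth 8, the original
         value 22 becomes 181, and 181 >> 3 yields 22 again.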

         If the sBIT chunk appears, it must precede the first IDAT
         chunk, and it must also precede the PLTE chunk if present.

         See Recommendations for Encoders: Sample depth scaling (Section
         9.1) and Recommendations for Decoders: Sample depth rescaling
         (Section 10.4).

      4.2.7. tEXt Textual data

         Textual information that the encoder wishes to record with the
         image can be stored in tEXt chunks.  Each tEXt chunk contains a
         keyword and a text string, in the format:

            Keyword:        1-79 bytes (character string)
            Null separator: 1 byte
            Text:           n bytes (character string)

         The keyword and text string are separated by a zero byte (null
         character).  Neither the keyword nor the text string can
         contain a null character.  Note that the text string is not
         null-terminated (the length of the chunk is sufficient
         information to locate the ending).  The keyword must be at
         least one character and less than 80 characters long.  The text
         string can be of any length from zero bytes up to the maximum
         permissible chunk size less the length of the keyword and
         separator.

         Any number of tEXt chunks can appear, and more than one with
         the same keyword is permissible.

         The keyword indicates the type of information represented by
         the text string.  The following keywords are predefined and
         should be used where appropriate:

            Title            Short (one line) title or caption for image
            Author           Name of image's creator
            Description      Description of image (possibly long)
            Copyright        Copyright notice
            Creation Time    Time of original image creation
            Software         Software used to create the image
            Disclaimer       Legal disclaimer
            Warning          Warning of nature of content
            Source           Device used to create the image
            Comment          Miscellaneous comment; conversion from
                             GIF comment

         For the Creation Time keyword, the date format defined in
         section 5.2.14 of RFC 1123 is suggested, but not required
         [RFC-1123].  Decoders should allow for free-format text
         associated with this or any other keyword.

         Other keywords may be invented for other purposes.  Keywords of
         general interest can be registered with the maintainers of the
         PNG specification.  However, it is also permitted to use
         private unregistered keywords.  (Private keywords should be
         reasonably self-explanatory, in order to minimize the chance
         that the same keyword will be used for incompatible purposes by
         different people.)

         Both keyword and text are interpreted according to the ISO
         8859-1 (Latin-1) character set [ISO-8859].  The text string can
         contain any Latin-1 character.  Newlines in the text string
         should be represented by a single linefeed character (decimal
         10); use of other control characters in the text is
         discouraged.

         Keywords must contain only printable Latin-1 characters and
         spaces; that is, only character codes 32-126 and 161-255
         decimal are allowed.  To reduce the chances for human
         misreading of a keyword, leading and trailing spaces are
         forbidden, as are consecutive spaces.  Note also that the non-
         breaking space (code 160) is not permitted in keywords, since
         it is visually indistinguishable from an ordinary space.

         Keywords must be spelled exactly as registered, so that
         decoders can use simple literal comparisons when looking for
         particular keywords.  In particular, keywords are considered
         case-sensitive.
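
         The keyword rules and the chunk layout can be checked with a
         sketch like the following.  Both function names are
         hypothetical, and a real decoder may prefer to tolerate
         malformed keywords rather than reject them.

```python
def is_valid_keyword(keyword):
    """Apply the keyword rules above to an already-decoded string."""
    if not 1 <= len(keyword) <= 79:
        return False
    # Only printable Latin-1 characters and ordinary spaces are allowed;
    # this excludes control characters and the non-breaking space (160).
    if any(not (32 <= ord(ch) <= 126 or 161 <= ord(ch) <= 255) for ch in keyword):
        return False
    # No leading, trailing, or consecutive spaces.
    if keyword[0] == " " or keyword[-1] == " " or "  " in keyword:
        return False
    return True


def parse_text(data):
    """Split tEXt chunk data into (keyword, text), both decoded as Latin-1."""
    keyword_bytes, separator, text_bytes = data.partition(b"\x00")
    if not separator:
        raise ValueError("missing null separator")
    if b"\x00" in text_bytes:
        raise ValueError("text must not contain a null character")
    keyword = keyword_bytes.decode("latin-1")
    if not is_valid_keyword(keyword):
        raise ValueError("invalid keyword")
    return keyword, text_bytes.decode("latin-1")
```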

         See Recommendations for Encoders: Text chunk processing
         (Section 9.7) and Recommendations for Decoders: Text chunk
         processing (Section 10.11).

      4.2.8. tIME Image last-modification time

         The tIME chunk gives the time of the last image modification
         (not the time of initial image creation).  It contains:

            Year:   2 bytes (complete; for example, 1995, not 95)
            Month:  1 byte (1-12)
            Day:    1 byte (1-31)
            Hour:   1 byte (0-23)
            Minute: 1 byte (0-59)
            Second: 1 byte (0-60)    (yes, 60, for leap seconds; not 61,
                                      a common error)

         Universal Time (UTC, also called GMT) should be specified
         rather than local time.
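
         A minimal sketch of writing and reading the seven tIME bytes.
         The helper names are hypothetical.  The decoder returns a plain
         tuple rather than a datetime object because Python's datetime
         type cannot represent second 60.

```python
import struct
from datetime import datetime, timezone


def encode_time(moment):
    """Build tIME chunk data from a timezone-aware datetime, in UTC."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    return struct.pack(">HBBBBB", utc.year, utc.month, utc.day,
                       utc.hour, utc.minute, utc.second)


def decode_time(data):
    """Return (year, month, day, hour, minute, second) from tIME data."""
    if len(data) != 7:
        raise ValueError("tIME data must be exactly 7 bytes")
    year, month, day, hour, minute, second = struct.unpack(">HBBBBB", data)
    in_range = (1 <= month <= 12 and 1 <= day <= 31 and hour <= 23
                and minute <= 59 and second <= 60)  # 60 allows a leap second
    if not in_range:
        raise ValueError("tIME field out of range")
    return year, month, day, hour, minute, second
```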

         The tIME chunk is intended for use as an automatically-applied
         time stamp that is updated whenever the image data is changed.
         It is recommended that tIME not be changed by PNG editors that
         do not change the image data.  See also the Creation Time tEXt
         keyword, which can be used for a user-supplied time.

      4.2.9. tRNS Transparency

         The tRNS chunk specifies that the image uses simple
         transparency: either alpha values associated with palette
         entries (for indexed-color images) or a single transparent
         color (for grayscale and truecolor images).  Although simple
         transparency is not as elegant as the full alpha channel, it
         requires less storage space and is sufficient for many common
         cases.

         For color type 3 (indexed color), the tRNS chunk contains a
         series of one-byte alpha values, corresponding to entries in
         the PLTE chunk:

            Alpha for palette index 0:  1 byte
            Alpha for palette index 1:  1 byte
            ... etc ...

         Each entry indicates that pixels of the corresponding palette
         index must be treated as having the specified alpha value.
         Alpha values have the same interpretation as in an 8-bit full
         alpha channel: 0 is fully transparent, 255 is fully opaque,
         regardless of image bit depth. The tRNS chunk must not contain
         more alpha values than there are palette entries, but tRNS can
         contain fewer values than there are palette entries.  In this
         case, the alpha value for all remaining palette entries is
         assumed to be 255.  In the common case in which only palette
         index 0 need be made transparent, only a one-byte tRNS chunk is
         needed.
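
         The defaulting rule for short tRNS chunks can be sketched as
         follows; palette_alphas is a hypothetical helper name.

```python
def palette_alphas(trns_data, palette_size):
    """Return one alpha value per palette entry for an indexed-color image."""
    if len(trns_data) > palette_size:
        raise ValueError("tRNS has more entries than the palette")
    # Entries missing from tRNS are treated as fully opaque (255).
    return list(trns_data) + [255] * (palette_size - len(trns_data))
```

         A one-byte tRNS chunk holding 0 therefore makes palette index 0
         transparent and leaves every other index opaque.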

         For color type 0 (grayscale), the tRNS chunk contains a single
         gray level value, stored in the format:

            Gray:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes are used regardless of the image bit
         depth.) Pixels of the specified gray level are to be treated as
         transparent (equivalent to alpha value 0); all other pixels are
         to be treated as fully opaque (alpha value (2^bitdepth)-1).

         For color type 2 (truecolor), the tRNS chunk contains a single
         RGB color value, stored in the format:

            Red:   2 bytes, range 0 .. (2^bitdepth)-1
            Green: 2 bytes, range 0 .. (2^bitdepth)-1
            Blue:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes per sample are used regardless of the
         image bit depth.) Pixels of the specified color value are to be
         treated as transparent (equivalent to alpha value 0); all other
         pixels are to be treated as fully opaque (alpha value
         (2^bitdepth)-1).

         tRNS is prohibited for color types 4 and 6, since a full alpha
         channel is already present in those cases.

         Note: when dealing with 16-bit grayscale or truecolor data, it
         is important to compare both bytes of the sample values to
         determine whether a pixel is transparent.  Although decoders
         may drop the low-order byte of the samples for display, this
         must not occur until after the data has been tested for
         transparency.  For example, if the grayscale level 0x0001 is
         specified to be transparent, it would be incorrect to compare
         only the high-order byte and decide that 0x0002 is also
         transparent.
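
         The ordering described in this note can be sketched for 16-bit
         grayscale data: test the full sample first, then reduce it for
         display.  Both function names are hypothetical.

```python
def gray16_alpha(sample, transparent_gray):
    """Compare the full 16-bit sample against the tRNS gray level."""
    return 0 if sample == transparent_gray else 0xFFFF


def to_8bit(sample):
    """Drop the low-order byte for display, only after the alpha test."""
    return sample >> 8
```

         Samples 0x0001 and 0x0002 both reduce to 0 for display, which
         is why comparing after the reduction would wrongly mark both
         as transparent.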

         When present, the tRNS chunk must precede the first IDAT chunk,
         and must follow the PLTE chunk, if any.

      4.2.10. zTXt Compressed textual data

         The zTXt chunk contains textual data, just as tEXt does;
         however, zTXt takes advantage of compression.  zTXt and tEXt
         chunks are semantically equivalent, but zTXt is recommended for
         storing large blocks of text.

         A zTXt chunk contains:

            Keyword:            1-79 bytes (character string)
            Null separator:     1 byte
            Compression method: 1 byte
            Compressed text:    n bytes

         The keyword and null separator are exactly the same as in the
         tEXt chunk.  Note that the keyword is not compressed.  The
         compression method byte identifies the compression method used
         in this zTXt chunk.  The only value presently defined for it is
         0 (deflate/inflate compression). The compression method byte is
         followed by a compressed datastream that makes up the remainder
         of the chunk.  For compression method 0, this datastream
         adheres to the zlib datastream format (see Deflate/Inflate
         Compression, Chapter 5).  Decompression of this datastream
         yields Latin-1 text that is identical to the text that would be
         stored in an equivalent tEXt chunk.
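
         A minimal sketch of building and reading zTXt data with
         Python's zlib module.  The helper names are hypothetical, and
         keyword validation is omitted for brevity.

```python
import zlib


def build_ztxt(keyword, text):
    """Assemble zTXt data: keyword, null, method byte 0, zlib stream."""
    compressed = zlib.compress(text.encode("latin-1"))
    # The keyword itself is stored uncompressed.
    return keyword.encode("latin-1") + b"\x00" + b"\x00" + compressed


def parse_ztxt(data):
    """Return (keyword, text) from zTXt chunk data."""
    keyword_bytes, separator, rest = data.partition(b"\x00")
    if not separator or not rest:
        raise ValueError("missing null separator or compression method")
    if rest[0] != 0:
        raise ValueError("unrecognized compression method")
    text = zlib.decompress(rest[1:]).decode("latin-1")
    return keyword_bytes.decode("latin-1"), text
```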

         Any number of zTXt and tEXt chunks can appear in the same file.
         See the preceding definition of the tEXt chunk for the
         predefined keywords and the recommended format of the text.

         See Recommendations for Encoders: Text chunk processing
         (Section 9.7), and Recommendations for Decoders: Text chunk
         processing (Section 10.11).

   4.3. Summary of standard chunks

      This table summarizes some properties of the standard chunk types.

         Critical chunks (must appear in this order, except PLTE
                          is optional):

                 Name  Multiple  Ordering constraints
                         OK?

                 IHDR    No      Must be first
                 PLTE    No      Before IDAT
                 IDAT    Yes     Multiple IDATs must be consecutive
                 IEND    No      Must be last

         Ancillary chunks (need not appear in this order):

                 Name  Multiple  Ordering constraints
                         OK?

                 cHRM    No      Before PLTE and IDAT
                 gAMA    No      Before PLTE and IDAT
                 sBIT    No      Before PLTE and IDAT
                 bKGD    No      After PLTE; before IDAT
                 hIST    No      After PLTE; before IDAT
                 tRNS    No      After PLTE; before IDAT
                 pHYs    No      Before IDAT
                 tIME    No      None
                 tEXt    Yes     None
                 zTXt    Yes     None
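
      The constraints in the two tables above can be checked with a
      sketch like the one below.  It takes the list of chunk type names
      in file order and returns a list of problems found.  The function
      name and message texts are hypothetical, and rules that depend on
      the color type (such as whether PLTE is required) are not checked.

```python
BEFORE_PLTE = {"cHRM", "gAMA", "sBIT"}
AFTER_PLTE = {"bKGD", "hIST", "tRNS"}
BEFORE_IDAT = BEFORE_PLTE | AFTER_PLTE | {"PLTE", "pHYs"}
SINGLE_ONLY = {"IHDR", "PLTE", "IEND", "cHRM", "gAMA", "sBIT",
               "bKGD", "hIST", "tRNS", "pHYs", "tIME"}


def check_chunk_order(names):
    """Return a list of ordering problems; an empty list means none found."""
    errors = []
    if not names or names[0] != "IHDR":
        errors.append("IHDR must be first")
    if not names or names[-1] != "IEND":
        errors.append("IEND must be last")
    for name in sorted(SINGLE_ONLY):
        if names.count(name) > 1:
            errors.append(f"{name} must not appear more than once")
    idat_positions = [i for i, name in enumerate(names) if name == "IDAT"]
    if not idat_positions:
        errors.append("at least one IDAT is required")
    else:
        first, last = idat_positions[0], idat_positions[-1]
        if last - first + 1 != len(idat_positions):
            errors.append("IDAT chunks must be consecutive")
        for i, name in enumerate(names):
            if name in BEFORE_IDAT and i > first:
                errors.append(f"{name} must precede the first IDAT")
    if "PLTE" in names:
        plte = names.index("PLTE")
        for i, name in enumerate(names):
            if name in BEFORE_PLTE and i > plte:
                errors.append(f"{name} must precede PLTE")
            if name in AFTER_PLTE and i < plte:
                errors.append(f"{name} must follow PLTE")
    return errors
```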

      Standard keywords for tEXt and zTXt chunks:

         Title            Short (one line) title or caption for image
         Author           Name of image's creator
         Description      Description of image (possibly long)
         Copyright        Copyright notice
         Creation Time    Time of original image creation
         Software         Software used to create the image
         Disclaimer       Legal disclaimer
         Warning          Warning of nature of content
         Source           Device used to create the image
         Comment          Miscellaneous comment; conversion from
                          GIF comment

   4.4. Additional chunk types

      Additional public PNG chunk types are defined in the document "PNG
      Special-Purpose Public Chunks" [PNG-EXTENSIONS].  Chunks described
      there are expected to be less widely supported than those defined
      in this specification.  However, application authors are
      encouraged to use those chunk types whenever appropriate for their
      applications.  Additional chunk types can be proposed for
      inclusion in that list by contacting the PNG specification
      maintainers at png-info@uunet.uu.net or at png-group@w3.org.

      New public chunks will only be registered if they are of use to
      others and do not violate the design philosophy of PNG. Chunk
      registration is not automatic, although it is the intent of the
      authors that it be straightforward when a new chunk of potentially
      wide application is needed.  Note that the creation of new
      critical chunk types is discouraged unless absolutely necessary.

      Applications can also use private chunk types to carry data that
      is not of interest to other applications.  See Recommendations for
      Encoders: Use of private chunks (Section 9.8).

      Decoders must be prepared to encounter unrecognized public or
      private chunk type codes.  Unrecognized chunk types must be
      handled as described in Chunk naming conventions (Section 3.3).

5. Deflate/Inflate Compression

   PNG compression method 0 (the only compression method presently
   defined for PNG) specifies deflate/inflate compression with a 32K
   sliding window.  Deflate compression is an LZ77 derivative used in
   zip, gzip, pkzip and related programs.  Extensive research has been
   done supporting its patent-free status.  Portable C implementations
   are freely available.

   Deflate-compressed datastreams within PNG are stored in the "zlib"
   format, which has the structure:

      Compression method/flags code: 1 byte
      Additional flags/check bits:   1 byte
      Compressed data blocks:        n bytes
      Check value:                   4 bytes

   Further details on this format are given in the zlib specification
   [RFC-1950].

   For PNG compression method 0, the zlib compression method/flags code
   must specify method code 8 ("deflate" compression) and an LZ77 window
   size of not more than 32K.  Note that the zlib compression method
   number is not the same as the PNG compression method number.  The
   additional flags must not specify a preset dictionary.
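
   As an illustration only, the header constraints above can be checked
   as sketched below in Python.  The function name is hypothetical; the
   bit layout of the two header bytes (method in the low nibble of the
   first byte, window size code in the high nibble, a multiple-of-31
   check, and a preset-dictionary flag in bit 5 of the second byte) is
   taken from [RFC-1950].

```python
def check_zlib_header(cmf: int, flg: int) -> None:
    """Raise ValueError if the two zlib header bytes are unusable for PNG."""
    method = cmf & 0x0F      # low nibble: zlib compression method
    cinfo = cmf >> 4         # high nibble: log2(window size) - 8
    if method != 8:
        raise ValueError("zlib method code is not 8 (deflate)")
    if cinfo > 7:
        raise ValueError("LZ77 window size exceeds 32K")
    if (cmf * 256 + flg) % 31 != 0:
        raise ValueError("zlib header check bits are wrong")
    if flg & 0x20:
        raise ValueError("preset dictionary is not allowed in PNG")


# The common header bytes 0x78 0x9C (deflate, 32K window) are accepted.
check_zlib_header(0x78, 0x9C)
```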

   The compressed data within the zlib datastream is stored as a series
   of blocks, each of which can represent raw (uncompressed) data,
   LZ77-compressed data encoded with fixed Huffman codes, or LZ77-
   compressed data encoded with custom Huffman codes.  A marker bit in
   the final block identifies it as the last block, allowing the decoder
   to recognize the end of the compressed datastream.  Further details
   on the compression algorithm and the encoding are given in the
   deflate specification [RFC-1951].

   The check value stored at the end of the zlib datastream is
   calculated on the uncompressed data represented by the datastream.
   Note that the algorithm used (Adler-32, defined in [RFC-1950]) is
   not the same as the CRC calculation used for PNG chunk check values.
   The zlib check value is useful mainly as a cross-check that the
   deflate and inflate algorithms are implemented correctly.  Verifying
   the chunk CRCs provides adequate confidence that the PNG file has
   been transmitted undamaged.
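
   The difference between the two checks can be seen with a short
   Python sketch.  It assumes the standard zlib module; the sample data
   and variable names are made up for illustration.  [RFC-1950] stores
   an Adler-32 value, most significant byte first, in the last four
   bytes of the datastream.

```python
import zlib

raw = b"filtered scanline bytes " * 10   # hypothetical uncompressed data
stream = zlib.compress(raw)

# The last four bytes of a zlib datastream hold the check value,
# most significant byte first.
stored_check = int.from_bytes(stream[-4:], "big")

adler = zlib.adler32(raw)   # check used by the zlib format
crc = zlib.crc32(raw)       # algorithm used for PNG chunk CRCs

matches_adler = stored_check == adler
```

   Here matches_adler is true, because the stored value is computed
   over the uncompressed data with Adler-32 rather than with the CRC
   algorithm used for chunks.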

   In a PNG file, the concatenation of the contents of all the IDAT
   chunks makes up a zlib datastream as specified above.  This
   datastream decompresses to filtered image data as described elsewhere
   in this document.

   It is important to emphasize that the boundaries between IDAT chunks
   are arbitrary and can fall anywhere in the zlib datastream.  There is
   not necessarily any correlation between IDAT chunk boundaries and
   deflate block boundaries or any other feature of the zlib data.  For
   example, it is entirely possible for the terminating zlib check value
   to be split across IDAT chunks.

   In the same vein, there is no required correlation between the
   structure of the image data (i.e., scanline boundaries) and deflate
   block boundaries or IDAT chunk boundaries.  The complete image data
   is represented by a single zlib datastream that is stored in some
   number of IDAT chunks; a decoder that assumes any more than this is
   incorrect.  (Of course, some encoder implementations may emit files
   in which some of these structures are indeed related.  But decoders
   cannot rely on this.)
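
   A decoder can honor this by feeding the contents of each IDAT chunk
   to a streaming inflater, never assuming anything about where the
   chunk boundaries fall.  The following Python sketch shows one way to
   do so, assuming the standard zlib module; the function name and the
   sample split points are hypothetical.

```python
import zlib


def inflate_idat(idat_payloads):
    """Decompress the zlib datastream formed by concatenated IDAT contents."""
    inflater = zlib.decompressobj()
    out = bytearray()
    for payload in idat_payloads:       # chunk boundaries carry no meaning
        out += inflater.decompress(payload)
    out += inflater.flush()
    if not inflater.eof:
        raise ValueError("zlib datastream ended prematurely")
    return bytes(out)


# Made-up data, split at awkward places: inside the 2-byte zlib header
# and inside the trailing 4-byte check value.
data = bytes(range(256)) * 4
stream = zlib.compress(data)
pieces = [stream[:1], stream[1:7], stream[7:-2], stream[-2:]]
result = inflate_idat(pieces)
```

   The result equals the original data regardless of how the stream is
   divided among the pieces.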

   PNG also uses zlib datastreams in zTXt chunks.  In a zTXt chunk, the
   remainder of the chunk following the compression method byte is a
   zlib datastream as specified above.  This datastream decompresses to
   the user-readable text described by the chunk's keyword.  Unlike the
   image data, such datastreams are not split across chunks; each zTXt
   chunk contains an independent zlib datastream.
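
   As a sketch of how a zTXt chunk might be unpacked, the Python
   function below splits the chunk data into keyword and text.  It
   assumes the layout given in the zTXt chunk definition (keyword, null
   separator, compression method byte, compressed text) and the Latin-1
   character set used for text chunks; the function name is
   hypothetical.

```python
import zlib


def parse_ztxt(chunk_data: bytes):
    """Split zTXt chunk data into (keyword, text), both returned as str."""
    keyword, sep, rest = chunk_data.partition(b"\x00")
    if not sep or not keyword or len(rest) < 1:
        raise ValueError("malformed zTXt chunk")
    if rest[0] != 0:
        raise ValueError("unknown zTXt compression method")
    # Everything after the method byte is one complete zlib datastream.
    text = zlib.decompress(rest[1:])
    return keyword.decode("latin-1"), text.decode("latin-1")
```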

   Additional documentation and portable C code for deflate and inflate
   are available from the Info-ZIP archives at
   <URL:ftp://ftp.uu.net/pub/archiving/zip/>.

6. Filter Algorithms

   This chapter describes the filter algorithms that can be applied
   before compression.  The purpose of these filters is to prepare the
   image data for optimum compression.

   6.1. Filter types

      PNG filter method 0 defines five basic filter types:

         Type    Name

         0       None
         1       Sub
         2       Up
         3       Average
         4       Paeth

      (Note that filter method 0 in IHDR specifies exactly this set of
      five filter types.  If the set of filter types is ever extended, a
      different filter method number will be assigned to the extended
      set, so that decoders need not decompress the data to discover
      that it contains unsupported filter types.)

      The encoder can choose which of these filter algorithms to apply
      on a scanline-by-scanline basis.  In the image data sent to the
      compression step, each scanline is preceded by a filter type byte
      that specifies the filter algorithm used for that scanline.
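
      For a non-interlaced image, the decompressed data can therefore
      be cut into records of one filter type byte followed by one
      scanline.  The Python sketch below illustrates this; the function
      name is hypothetical, and row_bytes (the number of bytes in one
      scanline) is assumed to have been computed from the IHDR fields.

```python
def split_scanlines(filtered: bytes, row_bytes: int):
    """Yield (filter_type, scanline_bytes) pairs from decompressed data."""
    stride = row_bytes + 1              # one filter type byte per scanline
    if len(filtered) % stride != 0:
        raise ValueError("data is not a whole number of scanlines")
    for start in range(0, len(filtered), stride):
        ftype = filtered[start]
        if ftype > 4:                   # filter method 0 defines types 0..4
            raise ValueError("unknown filter type %d" % ftype)
        yield ftype, filtered[start + 1:start + stride]
```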

      Filtering algorithms are applied to bytes, not to pixels,
      regardless of the bit depth or color type of the image.  The
      filtering algorithms work on the byte sequence formed by a
      scanline that has been represented as described in Image layout
      (Section 2.3).  If the image includes an alpha channel, the alpha
      data is filtered in the same way as the image data.

      When the image is interlaced, each pass of the interlace pattern
      is treated as an independent image for filtering purposes.  The
      filters work on the byte sequences formed by the pixels actually
      transmitted during a pass, and the "previous scanline" is the one
      previously transmitted in the same pass, not the one adjacent in
      the complete image.  Note that the subimage transmitted in any one
      pass is always rectangular, but is of smaller width and/or height
      than the complete image.  Filtering is not applied when this
      subimage is empty.
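
      The size of each pass, and hence whether it is empty, follows
      from the interlace pattern.  The Python sketch below assumes the
      seven-pass pattern of interlace method 1 (Adam7) as described in
      the interlaced data order section; the table and function names
      are hypothetical.

```python
# Adam7 pass geometry as (x_start, y_start, x_step, y_step).
ADAM7 = [(0, 0, 8, 8), (4, 0, 8, 8), (0, 4, 4, 8), (2, 0, 4, 4),
         (0, 2, 2, 4), (1, 0, 2, 2), (0, 1, 1, 2)]


def pass_sizes(width: int, height: int):
    """Return [(pass_width, pass_height), ...] for the seven passes."""
    sizes = []
    for x0, y0, dx, dy in ADAM7:
        w = (width - x0 + dx - 1) // dx     # columns that fall in this pass
        h = (height - y0 + dy - 1) // dy    # rows that fall in this pass
        sizes.append((w, h))
    return sizes
```

      A pass whose width or height comes out as zero is empty; it
      contributes no scanlines and no filter type bytes.  For a 1x1
      image, for instance, only the first pass is non-empty.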

      For all filters, the bytes "to the left of" the first pixel in a
      scanline must be treated as being zero.  For filters that refer to
      the prior scanline, the entire prior scanline must be treated as
      being zeroes for the first scanline of an image (or of a pass of
      an interlaced image).

      To reverse the effect of a filter, the decoder must use the
      decoded values of the prior pixel on the same line, the pixel
      immediately above the current pixel on the prior line, and the
      pixel just to the left of the pixel above.  This implies that at
      least one scanline's worth of image data will have to be stored by
      the decoder at all times.  Even though some filter types do not
      refer to the prior scanline, the decoder will always need to store
      each scanline as it is decoded, since the next scanline might use
      a filter that refers to it.

      PNG imposes no restriction on which filter types can be applied to
      an image.  However, the filters are not equally effective on all
      types of data.  See Recommendations for Encoders: Filter selection
      (Section 9.6).

      See also Rationale: Filtering (Section 12.9).

   6.2. Filter type 0: None

      With the None filter, the scanline is transmitted unmodified; it
      is only necessary to insert a filter type byte before the data.

   6.3. Filter type 1: Sub

      The Sub filter transmits the difference between each byte and the
      value of the corresponding byte of the prior pixel.

      To compute the Sub filter, apply the following formula to each
      byte of the scanline:

         Sub(x) = Raw(x) - Raw(x-bpp)

      where x ranges from zero to the number of bytes representing the
      scanline minus one, Raw(x) refers to the raw data byte at that
      byte position in the scanline, and bpp is defined as the number of
      bytes per complete pixel, rounding up to one. For example, for
      color type 2 with a bit depth of 16, bpp is equal to 6 (three
      samples, two bytes per sample); for color type 0 with a bit depth
      of 2, bpp is equal to 1 (rounding up); for color type 4 with a bit
      depth of 16, bpp is equal to 4 (two-byte grayscale sample, plus
      two-byte alpha sample).
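
      The definition of bpp can be written out as a short Python
      sketch.  The table of samples per pixel follows from the color
      type definitions in the IHDR section; the names used here are
      hypothetical.

```python
# Samples per pixel for each valid color type.
SAMPLES_PER_PIXEL = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def filter_bpp(color_type: int, bit_depth: int) -> int:
    """Bytes per complete pixel, rounded up to at least one."""
    bits_per_pixel = SAMPLES_PER_PIXEL[color_type] * bit_depth
    return (bits_per_pixel + 7) // 8
```

      This reproduces the three examples above: filter_bpp(2, 16) is 6,
      filter_bpp(0, 2) is 1, and filter_bpp(4, 16) is 4.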

      Note this computation is done for each byte, regardless of bit
      depth.  In a 16-bit image, each MSB is predicted from the
      preceding MSB and each LSB from the preceding LSB, because of the
      way that bpp is defined.

      Unsigned arithmetic modulo 256 is used, so that both the inputs
      and outputs fit into bytes.  The sequence of Sub values is
      transmitted as the filtered scanline.

      For all x < 0, assume Raw(x) = 0.

      To reverse the effect of the Sub filter after decompression,
      output the following value:

         Sub(x) + Raw(x-bpp)

      (computed mod 256), where Raw refers to the bytes already decoded.
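
      A minimal Python sketch of the Sub filter and its inverse is
      given below for illustration.  The function names are
      hypothetical, and bpp is assumed to have been computed as
      described above.

```python
def sub_filter(raw: bytes, bpp: int) -> bytes:
    """Apply the Sub filter to one scanline of raw bytes."""
    out = bytearray(len(raw))
    for x in range(len(raw)):
        left = raw[x - bpp] if x >= bpp else 0   # Raw(x-bpp), 0 if x-bpp < 0
        out[x] = (raw[x] - left) & 0xFF          # arithmetic modulo 256
    return bytes(out)


def sub_unfilter(filtered: bytes, bpp: int) -> bytes:
    """Reverse the Sub filter, reading back bytes already decoded."""
    raw = bytearray(len(filtered))
    for x in range(len(filtered)):
        left = raw[x - bpp] if x >= bpp else 0
        raw[x] = (filtered[x] + left) & 0xFF
    return bytes(raw)
```

      For the scanline bytes 10, 20, 30, 5 with bpp equal to 1, the
      filtered bytes are 10, 10, 10, 231, since 5 - 30 is 231 modulo
      256.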

   6.4. Filter type 2: Up

      The Up filter is just like the Sub filter except that the pixel
      immediately above the current pixel, rather than just to its left,
      is used as the predictor.

      To compute the Up filter, apply the following formula to each byte
      of the scanline:

         Up(x) = Raw(x) - Prior(x)

      where x ranges from zero to the number of bytes representing the
      scanline minus one, Raw(x) refers to the raw data byte at that
      byte position in the scanline, and Prior(x) refers to the
      unfiltered bytes of the prior scanline.

      Note this is done for each byte, regardless of bit depth.
      Unsigned arithmetic modulo 256 is used, so that both the inputs
      and outputs fit into bytes.  The sequence of Up values is
      transmitted as the filtered scanline.

      On the first scanline of an image (or of a pass of an interlaced
      image), assume Prior(x) = 0 for all x.

      To reverse the effect of the Up filter after decompression, output
      the following value:

         Up(x) + Prior(x)

      (computed mod 256), where Prior refers to the decoded bytes of the
      prior scanline.
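
      The Up filter and its inverse can be sketched in Python as
      follows.  The function names are hypothetical; the caller is
      assumed to pass a prior scanline of the same length as the
      current one, using all zero bytes for the first scanline.

```python
def up_filter(raw: bytes, prior: bytes) -> bytes:
    """Apply the Up filter; prior holds the unfiltered previous scanline."""
    return bytes((r - p) & 0xFF for r, p in zip(raw, prior))


def up_unfilter(filtered: bytes, prior: bytes) -> bytes:
    """Reverse the Up filter; prior holds the decoded previous scanline."""
    return bytes((f + p) & 0xFF for f, p in zip(filtered, prior))


# First scanline of an image or pass: Prior(x) = 0 for all x.
first = up_filter(bytes([5, 250]), bytes(2))
```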

   6.5. Filter type 3: Average

      The Average filter uses the average of the two neighboring pixels
      (left and above) to predict the value of a pixel.

      To compute the Average filter, apply the following formula to each
      byte of the scanline:

         Average(x) = Raw(x) - floor((Raw(x-bpp)+Prior(x))/2)

      where x ranges from zero to the number of bytes representing the
      scanline minus one, Raw(x) refers to the raw data byte at that
      byte position in the scanline, Prior(x) refers to the unfiltered
      bytes of the prior scanline, and bpp is defined as for the Sub
      filter.

      Note this is done for each byte, regardless of bit depth.  The
      sequence of Average values is transmitted as the filtered
      scanline.

      The subtraction of the predicted value from the raw byte must be
      done modulo 256, so that both the inputs and outputs fit into
      bytes.  However, the sum Raw(x-bpp)+Prior(x) must be formed
      without overflow (using at least nine-bit arithmetic).  floor()
      indicates that the result of the division is rounded to the next
      lower integer if fractional; in other words, it is an integer
      division or right shift operation.

      For all x < 0, assume Raw(x) = 0.  On the first scanline of an
      image (or of a pass of an interlaced image), assume Prior(x) = 0
      for all x.

      To reverse the effect of the Average filter after decompression,
      output the following value:

         Average(x) + floor((Raw(x-bpp)+Prior(x))/2)

      where the result is computed mod 256, but the prediction is
      calculated in the same way as for encoding.  Raw refers to the
      bytes already decoded, and Prior refers to the decoded bytes of
      the prior scanline.
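
      The Average filter can be sketched in Python as shown below.
      Python integers do not overflow, so the sum is formed exactly;
      in a language with 8-bit types the sum would need a wider
      intermediate type.  The function names are hypothetical, and
      prior is assumed to be all zero bytes for the first scanline.

```python
def average_filter(raw: bytes, prior: bytes, bpp: int) -> bytes:
    """Apply the Average filter to one scanline."""
    out = bytearray(len(raw))
    for x in range(len(raw)):
        left = raw[x - bpp] if x >= bpp else 0
        predicted = (left + prior[x]) >> 1      # exact sum, then floor of /2
        out[x] = (raw[x] - predicted) & 0xFF    # only this step is mod 256
    return bytes(out)


def average_unfilter(filtered: bytes, prior: bytes, bpp: int) -> bytes:
    """Reverse the Average filter using bytes already decoded."""
    raw = bytearray(len(filtered))
    for x in range(len(filtered)):
        left = raw[x - bpp] if x >= bpp else 0
        raw[x] = (filtered[x] + ((left + prior[x]) >> 1)) & 0xFF
    return bytes(raw)
```

      With raw bytes 200, 255, prior bytes 255, 255, and bpp equal to
      1, the second prediction is floor((200+255)/2) = 227, giving a
      filtered byte of 28.  Forming the sum modulo 256 would instead
      predict 99, which is why the wider sum matters.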

   6.6. Filter type 4: Paeth

      The Paeth filter computes a simple linear function of the three
      neighboring pixels (left, above, upper left), then chooses as
      predictor the neighboring pixel closest to the computed value.
      This technique is due to Alan W. Paeth [PAETH].

      To compute the Paeth filter, apply the following formula to each
      byte of the scanline:

         Paeth(x) = Raw(x) - PaethPredictor(Raw(x-bpp), Prior(x),
                                            Prior(x-bpp))

      where x ranges from zero to the number of bytes representing the
      scanline minus one, Raw(x) refers to the raw data byte at that
      byte position in the scanline, Prior(x) refers to the unfiltered
      bytes of the prior scanline, and bpp is defined as for the Sub
      filter.

      Note this is done for each byte, regardless of bit depth.
      Unsigned arithmetic modulo 256 is used, so that both the inputs
      and outputs fit into bytes.  The sequence of Paeth values is
      transmitted as the filtered scanline.

      The PaethPredictor function is defined by the following
      pseudocode:

         function PaethPredictor (a, b, c)
         begin
              ; a = left, b = above, c = upper left
              p := a + b - c        ; initial estimate
              pa := abs(p - a)      ; distances to a, b, c
              pb := abs(p - b)
              pc := abs(p - c)
              ; return nearest of a,b,c,
              ; breaking ties in order a,b,c.
              if pa <= pb AND pa <= pc then return a
              else if pb <= pc then return b
              else return c
         end

      The calculations within the PaethPredictor function must be
      performed exactly, without overflow.  Arithmetic modulo 256 is to
      be used only for the final step of subtracting the function result
      from the target byte value.

      Note that the order in which ties are broken is critical and must
      not be altered.  The tie break order is: pixel to the left, pixel
      above, pixel to the upper left.  (This order differs from that
      given in Paeth's article.)

      For all x < 0, assume Raw(x) = 0 and Prior(x) = 0.  On the first
      scanline of an image (or of a pass of an interlaced image), assume
      Prior(x) = 0 for all x.

      To reverse the effect of the Paeth filter after decompression,
      output the following value:

         Paeth(x) + PaethPredictor(Raw(x-bpp), Prior(x), Prior(x-bpp))

      (computed mod 256), where Raw and Prior refer to bytes already
      decoded.  Exactly the same PaethPredictor function is used by both
      encoder and decoder.
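
      For illustration, the pseudocode and formulas above translate
      into Python roughly as follows.  The function names are
      hypothetical, and prior is assumed to be all zero bytes for the
      first scanline.  Python integers are unbounded, so the predictor
      arithmetic is exact.

```python
def paeth_predictor(a: int, b: int, c: int) -> int:
    """a = left, b = above, c = upper left."""
    p = a + b - c                   # initial estimate
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:       # ties are broken in the order a, b, c
        return a
    if pb <= pc:
        return b
    return c


def paeth_filter(raw: bytes, prior: bytes, bpp: int) -> bytes:
    """Apply the Paeth filter to one scanline."""
    out = bytearray(len(raw))
    for x in range(len(raw)):
        a = raw[x - bpp] if x >= bpp else 0
        c = prior[x - bpp] if x >= bpp else 0
        out[x] = (raw[x] - paeth_predictor(a, prior[x], c)) & 0xFF
    return bytes(out)


def paeth_unfilter(filtered: bytes, prior: bytes, bpp: int) -> bytes:
    """Reverse the Paeth filter using bytes already decoded."""
    raw = bytearray(len(filtered))
    for x in range(len(filtered)):
        a = raw[x - bpp] if x >= bpp else 0
        c = prior[x - bpp] if x >= bpp else 0
        raw[x] = (filtered[x] + paeth_predictor(a, prior[x], c)) & 0xFF
    return bytes(raw)
```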

7. Chunk Ordering Rules

   To allow new chunk types to be added to PNG, it is necessary to
   establish rules about the ordering requirements for all chunk types.
   Otherwise a PNG editing program cannot know what to do when it
   encounters an unknown chunk.

   We define a "PNG editor" as a program that modifies a PNG file and
   wishes to preserve as much as possible of the ancillary information
   in the file.  Two examples of PNG editors are a program that adds or
   modifies text chunks, and a program that adds a suggested palette to
   a truecolor PNG file.  Ordinary image editors are not PNG editors in
   this sense, because they usually discard all unrecognized information
   while reading in an image.  (Note: we strongly encourage programs
   handling PNG files to preserve ancillary information whenever
   possible.)

   As an example of possible problems, consider a hypothetical new
   ancillary chunk type that is safe-to-copy and is required to appear
   after PLTE if PLTE is present.  If our program to add a suggested
   PLTE does not recognize this new chunk, it may insert PLTE in the
   wrong place, namely after the new chunk.  We could prevent such
   problems by requiring PNG editors to discard all unknown chunks, but
   that is a very unattractive solution.  Instead, PNG requires
   ancillary chunks not to have ordering restrictions like this.

   To prevent this type of problem while allowing for future extension,
   we put some constraints on both the behavior of PNG editors and the
   allowed ordering requirements for chunks.

   7.1. Behavior of PNG editors

      The rules for PNG editors are:

          * When copying an unknown unsafe-to-copy ancillary chunk, a
            PNG editor must not move the chunk relative to any critical
            chunk.  It can relocate the chunk freely relative to other
            ancillary chunks that occur between the same pair of
            critical chunks.  (This is well defined since the editor
            must not add, delete, modify, or reorder critical chunks if
            it is preserving unknown unsafe-to-copy chunks.)
          * When copying an unknown safe-to-copy ancillary chunk, a PNG
            editor must not move the chunk from before IDAT to after
            IDAT or vice versa.  (This is well defined because IDAT is
            always present.)  Any other reordering is permitted.
          * When copying a known ancillary chunk type, an editor need
            only honor the specific chunk ordering rules that exist for
            that chunk type.  However, it can always choose to apply the
            above general rules instead.
          * PNG editors must give up on encountering an unknown critical
            chunk type, because there is no way to be certain that a
            valid file will result from modifying a file containing such
            a chunk.  (Note that simply discarding the chunk is not good
            enough, because it might have unknown implications for the
            interpretation of other chunks.)
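
      The rules for unknown chunks can be summarized as a small
      decision function.  The Python sketch below is illustrative
      only: the function names and the returned strings are
      hypothetical.  The property bits (bit 5 of the first and fourth
      bytes of the chunk type) and the requirement to drop unknown
      unsafe-to-copy chunks once critical chunks have been changed come
      from Chunk naming conventions (Section 3.3).

```python
def is_ancillary(chunk_type: bytes) -> bool:
    return bool(chunk_type[0] & 0x20)   # first letter is lowercase


def is_safe_to_copy(chunk_type: bytes) -> bool:
    return bool(chunk_type[3] & 0x20)   # fourth letter is lowercase


def unknown_chunk_policy(chunk_type: bytes, critical_modified: bool) -> str:
    """Decide what a PNG editor may do with a chunk it does not recognize."""
    if not is_ancillary(chunk_type):
        return "give up"                        # unknown critical chunk
    if is_safe_to_copy(chunk_type):
        return "copy on same side of IDAT"      # any other reordering is fine
    if critical_modified:
        return "discard"                        # unsafe chunk may be stale
    return "copy between same critical chunks"
```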

      These rules are expressed in terms of copying chunks from an input
      file to an output file, but they apply in the obvious way if a PNG
      file is modified in place.

      See also Chunk naming conventions (Section 3.3).

   7.2. Ordering of ancillary chunks

      The ordering rules for an ancillary chunk type cannot be any
      stricter than this:

          * Unsafe-to-copy chunks can have ordering requirements
            relative to critical chunks.
          * Safe-to-copy chunks can have ordering requirements relative
            to IDAT.

      The actual ordering rules for any particular ancillary chunk type
      may be weaker.  See for example the ordering rules for the
      standard ancillary chunk types (Summary of standard chunks,
      Section 4.3).

      Decoders must not assume more about the positioning of any
      ancillary chunk than is specified by the chunk ordering rules.  In
      particular, it is never valid to assume that a specific ancillary
      chunk type occurs with any particular positioning relative to
      other ancillary chunks.  (For example, it is unsafe to assume that
      your private ancillary chunk occurs immediately before IEND.  Even
      if your application always writes it there, a PNG editor might
      have inserted some other ancillary chunk after it.  But you can
      safely assume that your chunk will remain somewhere between IDAT
      and IEND.)

   7.3. Ordering of critical chunks

      Critical chunks can have arbitrary ordering requirements, because
      PNG editors are required to give up if they encounter unknown
      critical chunks.  For example, IHDR has the special ordering rule
      that it must always appear first.  A PNG editor, or indeed any
      PNG-writing program, must know and follow the ordering rules for
      any critical chunk type that it can emit.

8. Miscellaneous Topics

   8.1. File name extension

      On systems where file names customarily include an extension
      signifying file type, the extension ".png" is recommended for PNG
      files.  Lower case ".png" is preferred if file names are case-
      sensitive.

   8.2. Internet media type

      The Internet Assigned Numbers Authority (IANA) has registered
      "image/png" as the Internet Media Type for PNG [RFC-2045, RFC-
      2048].  For robustness, decoders may choose to also support the
      interim media type "image/x-png" which was in use before
      registration was complete.

   8.3. Macintosh file layout

      In the Apple Macintosh system, the following conventions are
      recommended:

          * The four-byte file type code for PNG files is "PNGf".  (This
            code has been registered with Apple for PNG files.) The
            creator code will vary depending on the creating
            application.
          * The contents of the data fork must be a PNG file exactly as
            described in the rest of this specification.
          * The contents of the resource fork are unspecified.  It may
            be empty or may contain application-dependent resources.
          * When transferring a Macintosh PNG file to a non-Macintosh
            system, only the data fork should be transferred.

   8.4. Multiple-image extension

      PNG itself is strictly a single-image format.  However, it may be
      necessary to store multiple images within one file; for example,
      this is needed to convert some GIF files.  In the future, a
      multiple-image format based on PNG may be defined.  Such a format
      will be considered a separate file format and will have a
      different signature.  PNG-supporting applications may or may not
      choose to support the multiple-image format.

      See Rationale: Why not these features? (Section 12.3).

   8.5. Security considerations

      A PNG file or datastream is composed of a collection of explicitly
      typed "chunks".  Chunks whose contents are defined by the
      specification could actually contain anything, including malicious
      code.  But there is no known risk that such malicious code could
      be executed on the recipient's computer as a result of decoding
      the PNG image.

      The possible security risks associated with future chunk types
      cannot be specified at this time.  Security issues will be
      considered when evaluating chunks proposed for registration as
      public chunks.  There is no additional security risk associated
      with unknown or unimplemented chunk types, because such chunks
      will be ignored, or at most be copied into another PNG file.

      The tEXt and zTXt chunks contain data that is meant to be
      displayed as plain text.  It is possible that if the decoder
      displays such text without filtering out control characters,
      especially the ESC (escape) character, certain systems or
      terminals could behave in undesirable and insecure ways.  We
      recommend that decoders filter out control characters to avoid
      this risk; see Recommendations for Decoders: Text chunk processing
      (Section 10.11).
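
      One possible filtering policy is sketched below in Python.  The
      exact set of characters to remove is an assumption made for this
      example: it keeps linefeed, and drops all other characters below
      32 as well as characters 127 through 159.  The function name is
      hypothetical, and a decoder may reasonably choose a different
      policy (for instance, expanding tabs rather than dropping them).

```python
def sanitize_text(text: str) -> str:
    """Drop control characters from decoded text, keeping linefeeds."""
    kept = []
    for ch in text:
        code = ord(ch)
        is_control = code < 32 or 127 <= code <= 159
        if ch == "\n" or not is_control:
            kept.append(ch)
    return "".join(kept)
```

      With this policy an embedded ESC sequence loses its ESC
      character and is displayed as ordinary printable text.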

      Because every chunk's length is available at its beginning, and
      because every chunk has a CRC trailer, there is a very robust
      defense against corrupted data and against fraudulent chunks that
      attempt to overflow the decoder's buffers.  Also, the PNG
      signature bytes provide early detection of common file
      transmission errors.

      A decoder that fails to check CRCs could be subject to data
      corruption.  The only likely consequence of such corruption is
      incorrectly displayed pixels within the image.  Worse things might
      happen if the CRC of the IHDR chunk is not checked and the width
      or height fields are corrupted.  See Recommendations for Decoders:
      Error checking (Section 10.1).

      A poorly written decoder might be subject to buffer overflow,
      because chunks can be extremely large, up to (2^31)-1 bytes long.
      But properly written decoders will handle large chunks without
      difficulty.
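
      A defensive chunk reader might look like the Python sketch
      below.  It assumes the chunk layout from the chunk layout section
      (4-byte length, 4-byte type, data, and a 4-byte CRC computed over
      type and data) and that the 8-byte PNG signature has already been
      consumed.  The function name and the limit parameter, an
      application-chosen cap on chunk size, are hypothetical; a real
      decoder may prefer to process large chunks incrementally.

```python
import io
import struct
import zlib

MAX_CHUNK_LENGTH = 2**31 - 1


def read_chunks(stream, limit=16 * 1024 * 1024):
    """Yield (type, data) per chunk, checking length and CRC first."""
    while True:
        header = stream.read(8)
        if not header:
            return                              # clean end of input
        if len(header) != 8:
            raise ValueError("truncated chunk header")
        length, ctype = struct.unpack(">I4s", header)
        # Reject the length before allocating or reading anything.
        if length > MAX_CHUNK_LENGTH or length > limit:
            raise ValueError("chunk length out of range")
        data = stream.read(length)
        crc_bytes = stream.read(4)
        if len(data) != length or len(crc_bytes) != 4:
            raise ValueError("truncated chunk")
        if zlib.crc32(ctype + data) != struct.unpack(">I", crc_bytes)[0]:
            raise ValueError("CRC mismatch in chunk %r" % ctype)
        yield ctype, data


# Usage with a hand-built IEND chunk (zero-length data).
iend = struct.pack(">I4s", 0, b"IEND") + struct.pack(">I", zlib.crc32(b"IEND"))
chunks = list(read_chunks(io.BytesIO(iend)))
```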
