Appendix B. Insertion Encoding Instruction Examples
This appendix is non-normative. This appendix contains examples showing the use of insertion encoding instructions to remove extension ambiguity arising from use of the GROUP encoding instruction.B.1. Example 1
Consider the following type definition: SEQUENCE { one [GROUP] SEQUENCE { two UTF8String, ... -- Extension insertion point (I1). }, three INTEGER OPTIONAL, ... -- Extension insertion point (I2). } The associated grammar is: P1: S ::= one three I2 P2: one ::= two I1 P3: two ::= "two" P4: I1 ::= "*" I1 P5: I1 ::= P6: three ::= "three" P7: three ::= P8: I2 ::= "*" I2 P9: I2 ::= This grammar leads to the following sets and predicates: First(P4) = { "*" } First(P5) = { } Preselected(P4) = Preselected(P5) = Empty(P4) = false Empty(P5) = true Follow(I1) = { "three", "*", "$" } Select(P4) = First(P4) = { "*" } Select(P5) = First(P5) + Follow(I1) = { "three", "*", "$" }
First(P6) = { "three" } First(P7) = { } Preselected(P6) = Preselected(P7) = Empty(P6) = false Empty(P7) = true Follow(three) = { "*", "$" } Select(P6) = First(P6) = { "three" } Select(P7) = First(P7) + Follow(three) = { "*", "$" } First(P8) = { "*" } First(P9) = { } Preselected(P8) = Preselected(P9) = Empty(P8) = false Empty(P9) = true Follow(I2) = { "$" } Select(P8) = First(P8) = { "*" } Select(P9) = First(P9) + Follow(I2) = { "$" } The intersection of Select(P4) and Select(P5) is not empty; hence, the grammar is not deterministic, and the type definition is not valid. If an RXER decoder encounters an unrecognized element immediately after a <two> element, then it will not know whether to associate it with extension insertion point I1 or I2. The non-determinism can be resolved with either a NO-INSERTIONS or HOLLOW-INSERTIONS encoding instruction. Consider this revised type definition: SEQUENCE { one [GROUP] [HOLLOW-INSERTIONS] SEQUENCE { two UTF8String, ... -- Extension insertion point (I1). }, three INTEGER OPTIONAL, ... -- Extension insertion point (I2). } The associated grammar is: P1: S ::= one three I2 P10: one ::= two P3: two ::= "two" P6: three ::= "three" P7: three ::= P8: I2 ::= "*" I2 P9: I2 ::= With the addition of the HOLLOW-INSERTIONS encoding instruction, the P4 and P5 productions are no longer generated, and the conflict between Select(P4) and Select(P5) no longer exists. The Select Sets
for P6, P7, P8, and P9 are unchanged. A decoder will now assume that an unrecognized element is to be associated with extension insertion point I2. It is still free to associate an unrecognized attribute with either extension insertion point. If a NO-INSERTIONS encoding instruction had been used, then an unrecognized attribute could only be associated with extension insertion point I2. The non-determinism could also be resolved by adding a NO-INSERTIONS or HOLLOW-INSERTIONS encoding instruction to the outer SEQUENCE: [HOLLOW-INSERTIONS] SEQUENCE { one [GROUP] SEQUENCE { two UTF8String, ... -- Extension insertion point (I1). }, three INTEGER OPTIONAL, ... -- Extension insertion point (I2). } The associated grammar is: P11: S ::= one three P2: one ::= two I1 P3: two ::= "two" P4: I1 ::= "*" I1 P5: I1 ::= P6: three ::= "three" P7: three ::= This grammar leads to the following sets and predicates: First(P4) = { "*" } First(P5) = { } Preselected(P4) = Preselected(P5) = Empty(P4) = false Empty(P5) = true Follow(I1) = { "three", "$" } Select(P4) = First(P4) = { "*" } Select(P5) = First(P5) + Follow(I1) = { "three", "$" } First(P6) = { "three" } First(P7) = { } Preselected(P6) = Preselected(P7) = Empty(P6) = false Empty(P7) = true Follow(three) = { "$" } Select(P6) = First(P6) = { "three" } Select(P7) = First(P7) + Follow(three) = { "$" }
The intersection of Select(P4) and Select(P5) is empty, as is the intersection of Select(P6) and Select(P7); hence, the grammar is deterministic, and the type definition is valid. A decoder will now assume that an unrecognized element is to be associated with extension insertion point I1. It is still free to associate an unrecognized attribute with either extension insertion point. If a NO-INSERTIONS encoding instruction had been used, then an unrecognized attribute could only be associated with extension insertion point I1.B.2. Example 2
Consider the following type definition: SEQUENCE { one [GROUP] CHOICE { two UTF8String, ... -- Extension insertion point (I1). } OPTIONAL } The associated grammar is: P1: S ::= one P2: one ::= two P3: one ::= I1 P4: one ::= P5: two ::= "two" P6: I1 ::= "*" I1 P7: I1 ::= This grammar leads to the following sets and predicates: First(P2) = { "two" } First(P3) = { "*" } First(P4) = { } Preselected(P2) = Preselected(P3) = Preselected(P4) = false Empty(P2) = false Empty(P3) = Empty(P4) = true Follow(one) = { "$" } Select(P2) = First(P2) = { "two" } Select(P3) = First(P3) + Follow(one) = { "*", "$" } Select(P4) = First(P4) + Follow(one) = { "$" } First(P6) = { "*" } First(P7) = { } Preselected(P6) = Preselected(P7) = Empty(P6) = false Empty(P7) = true
Follow(I1) = { "$" } Select(P6) = First(P6) = { "*" } Select(P7) = First(P7) + Follow(I1) = { "$" } The intersection of Select(P3) and Select(P4) is not empty; hence, the grammar is not deterministic, and the type definition is not valid. If the <two> element is not present, then a decoder cannot determine whether the "one" alternative is absent, or present with an unknown extension that generates no elements. The non-determinism can be resolved with either a SINGULAR-INSERTIONS, UNIFORM-INSERTIONS, or MULTIFORM-INSERTIONS encoding instruction. The MULTIFORM-INSERTIONS encoding instruction is the least restrictive. Consider this revised type definition: SEQUENCE { one [GROUP] [MULTIFORM-INSERTIONS] CHOICE { two UTF8String, ... -- Extension insertion point (I1). } OPTIONAL } The associated grammar is: P1: S ::= one P2: one ::= two P8: one ::= "*" I1 P4: one ::= P5: two ::= "two" P6: I1 ::= "*" I1 P7: I1 ::= This grammar leads to the following sets and predicates: First(P2) = { "two" } First(P8) = { "*" } First(P4) = { } Preselected(P2) = Preselected(P8) = Preselected(P4) = false Empty(P2) = Empty(P8) = false Empty(P4) = true Follow(one) = { "$" } Select(P2) = First(P2) = { "two" } Select(P8) = First(P8) = { "*" } Select(P4) = First(P4) + Follow(one) = { "$" } First(P6) = { "*" } First(P7) = { } Preselected(P6) = Preselected(P7) = Empty(P6) = false
Empty(P7) = true Follow(I1) = { "$" } Select(P6) = First(P6) = { "*" } Select(P7) = First(P7) + Follow(I1) = { "$" } The intersection of Select(P2) and Select(P8) is empty, as is the intersection of Select(P2) and Select(P4), the intersection of Select(P8) and Select(P4), and the intersection of Select(P6) and Select(P7); hence, the grammar is deterministic, and the type definition is valid. A decoder will now assume the "one" alternative is present if it sees at least one unrecognized element, and absent otherwise.B.3. Example 3
Consider the following type definition: SEQUENCE { one [GROUP] CHOICE { two UTF8String, ... -- Extension insertion point (I1). }, three [GROUP] CHOICE { four UTF8String, ... -- Extension insertion point (I2). } } The associated grammar is: P1: S ::= one three P2: one ::= two P3: one ::= I1 P4: two ::= "two" P5: I1 ::= "*" I1 P6: I1 ::= P7: three ::= four P8: three ::= I2 P9: four ::= "four" P10: I2 ::= "*" I2 P11: I2 ::= This grammar leads to the following sets and predicates: First(P2) = { "two" } First(P3) = { "*" } Preselected(P2) = Preselected(P3) = Empty(P2) = false Empty(P3) = true
Follow(one) = { "four", "*", "$" } Select(P2) = First(P2) = { "two" } Select(P3) = First(P3) + Follow(one) = { "*", "four", "$" } First(P5) = { "*" } First(P6) = { } Preselected(P5) = Preselected(P6) = Empty(P5) = false Empty(P6) = true Follow(I1) = { "four", "*", "$" } Select(P5) = First(P5) = { "*" } Select(P6) = First(P6) + Follow(I1) = { "four", "*", "$" } First(P7) = { "four" } First(P8) = { "*" } Preselected(P7) = Preselected(P8) = Empty(P7) = false Empty(P8) = true Follow(three) = { "$" } Select(P7) = First(P7) = { "four" } Select(P8) = First(P8) + Follow(three) = { "*", "$" } First(P10) = { "*" } First(P11) = { } Preselected(P10) = Preselected(P11) = Empty(P10) = false Empty(P11) = true Follow(I2) = { "$" } Select(P10) = First(P10) = { "*" } Select(P11) = First(P11) + Follow(I2) = { "$" } The intersection of Select(P5) and Select(P6) is not empty; hence, the grammar is not deterministic, and the type definition is not valid. If the first child element is an unrecognized element, then a decoder cannot determine whether to associate it with extension insertion point I1, or to associate it with extension insertion point I2 by assuming that the "one" component has an unknown extension that generates no elements. The non-determinism can be resolved with either a SINGULAR-INSERTIONS or UNIFORM-INSERTIONS encoding instruction. Consider this revised type definition using the SINGULAR-INSERTIONS encoding instruction:
SEQUENCE { one [GROUP] [SINGULAR-INSERTIONS] CHOICE { two UTF8String, ... -- Extension insertion point (I1). }, three [GROUP] CHOICE { four UTF8String, ... -- Extension insertion point (I2). } } The associated grammar is: P1: S ::= one three P2: one ::= two P12: one ::= "*" P4: two ::= "two" P7: three ::= four P8: three ::= I2 P9: four ::= "four" P10: I2 ::= "*" I2 P11: I2 ::= With the addition of the SINGULAR-INSERTIONS encoding instruction, the P5 and P6 productions are no longer generated. The grammar leads to the following sets and predicates for the P2 and P12 productions: First(P2) = { "two" } First(P12) = { "*" } Preselected(P2) = Preselected(P12) = false Empty(P2) = Empty(P12) = false Follow(one) = { "four", "*", "$" } Select(P2) = First(P2) = { "two" } Select(P12) = First(P12) = { "*" } The sets for P5 and P6 are no longer generated, and the remaining sets are unchanged. The intersection of Select(P2) and Select(P12) is empty, as is the intersection of Select(P7) and Select(P8) and the intersection of Select(P10) and Select(P11); hence, the grammar is deterministic, and the type definition is valid. If the first child element is an unrecognized element, then a decoder will now assume that it is associated with extension insertion point I1. Whatever follows, possibly including another unrecognized element, will belong to the "three" component.
Now consider the type definition using the UNIFORM-INSERTIONS encoding instruction instead: SEQUENCE { one [GROUP] [UNIFORM-INSERTIONS] CHOICE { two UTF8String, ... -- Extension insertion point (I1). }, three [GROUP] CHOICE { four UTF8String, ... -- Extension insertion point (I2). } } The associated grammar is: P1: S ::= one three P2: one ::= two P13: one ::= "*" P14: one ::= "*1" I1 P4: two ::= "two" P15: I1 ::= "*1" I1 P6: I1 ::= P7: three ::= four P8: three ::= I2 P9: four ::= "four" P10: I2 ::= "*" I2 P11: I2 ::= This grammar leads to the following sets and predicates for the P2, P13, P14, P15, and P6 productions: First(P2) = { "two" } First(P13) = { "*" } First(P14) = { "*1" } Preselected(P2) = Preselected(P13) = Preselected(P14) = false Empty(P2) = Empty(P13) = Empty(P14) = false Follow(one) = { "four", "*", "$" } Select(P2) = First(P2) = { "two" } Select(P13) = First(P13) = { "*" } Select(P14) = First(P14) = { "*1" }
First(P15) = { "*1" } First(P6) = { } Preselected(P15) = Preselected(P6) = Empty(P15) = false Empty(P6) = true Follow(I1) = { "four", "*", "$" } Select(P15) = First(P15) = { "*1" } Select(P6) = First(P6) + Follow(I1) = { "four", "*", "$" } The remaining sets are unchanged. The intersection of Select(P2) and Select(P13) is empty, as is the intersection of Select(P2) and Select(P14), the intersection of Select(P13) and Select(P14) and the intersection of Select(P15) and Select(P6); hence, the grammar is deterministic, and the type definition is valid. If the first child element is an unrecognized element, then a decoder will now assume that it and every subsequent unrecognized element with the same name are associated with I1. Whatever follows, possibly including another unrecognized element with a different name, will belong to the "three" component. A consequence of using the UNIFORM-INSERTIONS encoding instruction is that any future extension to the "three" component will be required to generate elements with names that are different from the names of the elements generated by the "one" component. With the SINGULAR-INSERTIONS encoding instruction, extensions to the "three" component are permitted to generate elements with names that are the same as the names of the elements generated by the "one" component.B.4. Example 4
Consider the following type definition: SEQUENCE OF one [GROUP] CHOICE { two UTF8String, ... -- Extension insertion point (I1). } The associated grammar is: P1: S ::= one S P2: S ::= P3: one ::= two P4: one ::= I1 P5: two ::= "two" P6: I1 ::= "*" I1 P7: I1 ::=
This grammar leads to the following sets and predicates: First(P1) = { "two", "*" } First(P2) = { } Preselected(P1) = Preselected(P2) = false Empty(P1) = Empty(P2) = true Follow(S) = { "$" } Select(P1) = First(P1) + Follow(S) = { "two", "*", "$" } Select(P2) = First(P2) + Follow(S) = { "$" } First(P3) = { "two" } First(P4) = { "*" } Preselected(P3) = Preselected(P4) = Empty(P3) = false Empty(P4) = true Follow(one) = { "two", "*", "$" } Select(P3) = First(P3) = { "two" } Select(P4) = First(P4) + Follow(one) = { "*", "two", "$" } First(P6) = { "*" } First(P7) = { } Preselected(P6) = Preselected(P7) = Empty(P6) = false Empty(P7) = true Follow(I1) = { "two", "*", "$" } Select(P6) = First(P6) = { "*" } Select(P7) = First(P7) + Follow(I1) = { "two", "*", "$" } The intersection of Select(P1) and Select(P2) is not empty, as is the intersection of Select(P3) and Select(P4) and the intersection of Select(P6) and Select(P7); hence, the grammar is not deterministic, and the type definition is not valid. If a decoder encounters two or more unrecognized elements in a row, then it cannot determine whether this represents one instance or more than one instance of the "one" component. Even without unrecognized elements, there is still a problem that an encoding could contain an indeterminate number of "one" components using an extension that generates no elements. The non-determinism cannot be resolved with a UNIFORM-INSERTIONS encoding instruction. Consider this revised type definition using the UNIFORM-INSERTIONS encoding instruction: SEQUENCE OF one [GROUP] [UNIFORM-INSERTIONS] CHOICE { two UTF8String, ... -- Extension insertion point (I1). }
The associated grammar is: P1: S ::= one S P2: S ::= P3: one ::= two P8: one ::= "*" P9: one ::= "*1" I1 P5: two ::= "two" P10: I1 ::= "*1" I1 P7: I1 ::= This grammar leads to the following sets and predicates: First(P1) = { "two", "*", "*1" } First(P2) = { } Preselected(P1) = Preselected(P2) = Empty(P1) = false Empty(P2) = true Follow(S) = { "$" } Select(P1) = First(P1) = { "two", "*", "*1" } Select(P2) = First(P2) + Follow(S) = { "$" } First(P3) = { "two" } First(P8) = { "*" } First(P9) = { "*1" } Preselected(P3) = Preselected(P8) = Preselected(P9) = false Empty(P3) = Empty(P8) = Empty(P9) = false Follow(one) = { "two", "*", "*1", "$" } Select(P3) = First(P3) = { "two" } Select(P8) = First(P8) = { "*" } Select(P9) = First(P9) = { "*1" } First(P10) = { "*1" } First(P7) = { } Preselected(P10) = Preselected(P7) = Empty(P10) = false Empty(P7) = true Follow(I1) = { "two", "*", "*1", "$" } Select(P10) = First(P10) = { "*1" } Select(P7) = First(P7) + Follow(I1) = { "two", "*", "*1", "$" } The intersection of Select(P1) and Select(P2) is now empty, but the intersection of Select(P10) and Select(P7) is not; hence, the grammar is not deterministic, and the type definition is not valid. The problem of an indeterminate number of "one" components from an extension that generates no elements has been solved. However, if a decoder encounters a series of elements with the same name, it cannot determine whether this represents one instance or more than one instance of the "one" component.
The non-determinism can be fully resolved with a SINGULAR-INSERTIONS encoding instruction. Consider this revised type definition: SEQUENCE OF one [GROUP] [SINGULAR-INSERTIONS] CHOICE { two UTF8String, ... -- Extension insertion point (I1). } The associated grammar is: P1: S ::= one S P2: S ::= P3: one ::= two P8: one ::= "*" P5: two ::= "two" This grammar leads to the following sets and predicates: First(P1) = { "two", "*" } First(P2) = { } Preselected(P1) = Preselected(P2) = Empty(P1) = false Empty(P2) = true Follow(S) = { "$" } Select(P1) = First(P1) = { "two", "*" } Select(P2) = First(P2) + Follow(S) = { "$" } First(P3) = { "two" } First(P8) = { "*" } Preselected(P3) = Preselected(P8) = false Empty(P3) = Empty(P8) = false Follow(one) = { "two", "*", "$" } Select(P3) = First(P3) = { "two" } Select(P8) = First(P8) = { "*" } The intersection of Select(P1) and Select(P2) is empty, as is the intersection of Select(P3) and Select(P8); hence, the grammar is deterministic, and the type definition is valid. A decoder now knows that every extension to the "one" component will generate a single element, so the correct number of "one" components will be decoded.
Appendix C. Extension and Versioning Examples
This appendix is non-normative.C.1. Valid Extensions for Insertion Encoding Instructions
The first example shows extensions that satisfy the HOLLOW-INSERTIONS encoding instruction. [HOLLOW-INSERTIONS] CHOICE { one BOOLEAN, ..., two [ATTRIBUTE] INTEGER, three [GROUP] SEQUENCE { four [ATTRIBUTE] UTF8String, five [ATTRIBUTE] INTEGER OPTIONAL, ... }, six [GROUP] CHOICE { seven [ATTRIBUTE] BOOLEAN, eight [ATTRIBUTE] INTEGER } } The "two" and "six" components generate only attributes. The "three" component in its current form does not generate elements. Any extension to the "three" component will need to do likewise to avoid breaking forward compatibility. The second example shows extensions that satisfy the SINGULAR-INSERTIONS encoding instruction. [SINGULAR-INSERTIONS] CHOICE { one BOOLEAN, ..., two INTEGER, three [GROUP] SEQUENCE { four [ATTRIBUTE] UTF8String, five INTEGER }, six [GROUP] CHOICE { seven BOOLEAN, eight INTEGER } }
The "two" component will always generate a single <two> element. The "three" component will always generate a single <five> element. It will also generate a "four" attribute, but any number of attributes is allowed by the SINGULAR-INSERTIONS encoding instruction. The "six" component will either generate a single <seven> element or a single <eight> element. Either case will satisfy the requirement that there will be a single element in any given encoding of the extension. The third example shows extensions that satisfy the UNIFORM-INSERTIONS encoding instruction. [UNIFORM-INSERTIONS] CHOICE { one BOOLEAN, ..., two INTEGER, three [GROUP] SEQUENCE SIZE(1..MAX) OF four INTEGER, five [GROUP] SEQUENCE { six [ATTRIBUTE] UTF8String OPTIONAL, seven INTEGER }, eight [GROUP] CHOICE { nine BOOLEAN, ten [GROUP] SEQUENCE SIZE(1..MAX) OF eleven INTEGER } } The "two" component will always generate a single <two> element. The "three" component will always generate one or more <four> elements. The "five" component will always generate a single <seven> element. It may also generate a "six" attribute, but any number of attributes is allowed by the UNIFORM-INSERTIONS encoding instruction. The "eight" component will either generate a single <nine> element or one or more <eleven> elements. Either case will satisfy the requirement that there must be one or more elements with the same name in any given encoding of the extension.
C.2. Versioning Example
Making extensions that are not forward compatible is permitted provided that the incompatibility is signalled with a version indicator attribute. Suppose that version 1.0 of a specification contains the following type definition: MyMessageType ::= SEQUENCE { version [ATTRIBUTE] [VERSION-INDICATOR] UTF8String ("1.0", ...) DEFAULT "1.0", one [GROUP] [SINGULAR-INSERTIONS] CHOICE { two BOOLEAN, ... }, ... } An attribute is to be added to the CHOICE for version 1.1. This change is not forward compatible since it does not satisfy the SINGULAR-INSERTIONS encoding instruction. Therefore, the version indicator attribute must be updated at the same time (or added if it wasn't already present). This results in the following new type definition for version 1.1: MyMessageType ::= SEQUENCE { version [ATTRIBUTE] [VERSION-INDICATOR] UTF8String ("1.0", ..., "1.1") DEFAULT "1.0", one [GROUP] [SINGULAR-INSERTIONS] CHOICE { two BOOLEAN, ..., three [ATTRIBUTE] INTEGER -- Added in Version 1.1 }, ... } If a version 1.1 conformant application hasn't used the version 1.1 extension in a value of MyMessageType, then it is allowed to set the value of the version attribute to "1.0". A pair of elements is added to the CHOICE for version 1.2. Again the change does not satisfy the SINGULAR-INSERTIONS encoding instruction. The type definition for version 1.2 is:
MyMessageType ::= SEQUENCE { version [ATTRIBUTE] [VERSION-INDICATOR] UTF8String ("1.0", ..., "1.1" | "1.2") DEFAULT "1.0", one [GROUP] [SINGULAR-INSERTIONS] CHOICE { two BOOLEAN, ..., three [ATTRIBUTE] INTEGER, -- Added in Version 1.1 four [GROUP] SEQUENCE { five UTF8String, six GeneralizedTime } -- Added in version 1.2 }, ... } If a version 1.2 conformant application hasn't used the version 1.2 extension in a value of MyMessageType, then it is allowed to set the value of the version attribute to "1.1". If it hasn't used either of the extensions, then it is allowed to set the value of the version attribute to "1.0".Author's Address
Dr. Steven Legg eB2Bcom Suite 3, Woodhouse Corporate Centre 935 Station Street Box Hill North, Victoria 3129 AUSTRALIA Phone: +61 3 9896 7830 Fax: +61 3 9896 7801 EMail: steven.legg@eb2bcom.com
Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Acknowledgement Funding for the RFC Editor function is currently provided by the Internet Society.