Internationalization Text Proposal for Extensions

Jeffrey Altman
The following is a repost of text developed by Jeffrey Hutzelman and
myself for Extensions.

Internationalization of the Kerberos 5 protocol for use in [@EXTENSIONS]

Guiding Principles

Kerberos 5 is a wire protocol which is designed to be independent of operating
system, networking protocols, and naming services.  The goal is to provide the
utmost flexibility in the use of characters while specifying the necessary
restrictions to ensure future extensions to the protocol can operate in a
backward compatible manner.

It must be clearly recognized that the Kerberos protocol has no means to
identify the source or meaning of principal name components.  Therefore, the
protocol must specify a single set of rules for preparing and
interpreting internationalized strings.  Although when Kerberos is used on
the Internet common practice is to generate principals utilizing components
derived from DNS names, it is not possible for the Kerberos protocol to
recognize this.  

There are three ways strings can be used within a protocol: storing, querying,
and displaying.  Each of these uses of a string requires different properties.  

A "storage string" is a string transmitted within the protocol for the explicit
purpose of being stored into a database for subsequent comparisons during query
operations.

A "query string" is a string used for comparison against strings stored in a
database.

Neither "storage strings" nor "query strings" are meant for visual display to
an end-user.  The only strings in a protocol which are meant for end-user
interaction are "display strings".                                                

NOTE: the Kerberos protocol does not utilize any "storage strings".  This is
because the creation of principals, how they are stored in a database, and
how passwords are set or changed is outside the scope of the protocol.  These
decisions are left up to implementors and administrators.  

NOTE: A credential cache of tickets is not a database.

There has been much discussion of the need for ensuring that Kerberos strings
are compatible with the internationalized strings used within other protocols
(e.g., internationalized domain names).  The demand for this functionality is
derived from a failure on the part of other protocol designers to properly
enforce the distinctions between strings used for storage, queries, and
end-user display.  In particular, IDNA [RFC3490] specifically requires the
leakage of storage and query strings for use as display strings.  This leakage
has the side-effect of imposing the IDNA String Preparation rules onto some
potential sources of input when applications generate strings for use in the
Kerberos protocol.  To ensure compatibility between applications which accept
IDN wire representations as input and those which do not, it will be necessary
for there to be "best common practices" specified as recommendations for
implementors and administrators to use.  However, these BCPs must not be
restrictions imposed on the Kerberos protocol because the Kerberos string
space is not equivalent to the IDN string space.  In fact, it is much broader.


   The original specification of the Kerberos protocol in RFC 1510 uses
   GeneralString in numerous places for human-readable string data.
   Historical implementations of Kerberos cannot utilize the full power
   of GeneralString.  This ASN.1 type requires the use of designation
   and invocation escape sequences as specified in ISO-2022/ECMA-35
   [ISO-2022/ECMA-35] to switch character sets, and the default
   character set that is designated as G0 is the ISO-646/ECMA-6
   [ISO-646,ECMA-6] International Reference Version (IRV) (aka U.S.
   ASCII), which mostly works.

   ISO-2022/ECMA-35 defines four character-set code elements (G0..G3)
   and two Control-function code elements (C0..C1). DER prohibits the
   designation of character sets as any but the G0 and C0 sets.
   Unfortunately, this seems to have the side effect of prohibiting the
   use of ISO-8859 (ISO Latin) [ISO-8859] character-sets or any other
   character-sets that utilize a 96-character set, since it is
   prohibited by ISO-2022/ECMA-35 to designate them as the G0 code
   element. This side effect is being investigated in the ASN.1
   standards community.

   In practice, many implementations treat GeneralStrings as if they
   were 8-bit strings of whichever character set the implementation
   defaults to, without regard for correct usage of character-set
   designation escape sequences. The default character set is often
   determined by the current user's operating system dependent locale.
   At least one major implementation places unescaped UTF-8 encoded
   Unicode characters in the GeneralString. This failure to adhere to
   the GeneralString specifications results in interoperability issues
   when conflicting character encodings are utilized by the Kerberos
   clients, services, and KDC.

   This unfortunate situation is the result of improper documentation of
   the restrictions of the ASN.1 GeneralString type in prior Kerberos
   specifications.

   In [@CLARIFICATIONS] the type KerberosString, defined below, was a
   GeneralString constrained to contain only characters in IA5String:

      KerberosString  ::= GeneralString (IA5String)

   US-ASCII control characters should in general not be used in
   KerberosString, except for cases such as newlines in lengthy error
   messages. Control characters SHOULD NOT be used in principal names or
   realm names.

   For compatibility, implementations MAY choose to accept GeneralString
   values that contain characters other than those permitted by
   IA5String, but they should be aware that character set designation
   codes will likely be absent, and that the encoding should probably be
   treated as locale-specific in almost every way. Implementations MAY
   also choose to emit GeneralString values that are beyond those
   permitted by IA5String, but should be aware that doing so is
   extraordinarily risky from an interoperability perspective.

   Some existing implementations use GeneralString to encode unescaped
   locale-specific characters. This is a violation of the ASN.1
   standard. Most of these implementations encode US-ASCII in the left-
   hand half, so as long as the implementation transmits only US-ASCII, the
   ASN.1 standard is not violated in this regard. As soon as such an
   implementation encodes unescaped locale-specific characters with the
   high bit set, it violates the ASN.1 standard.

   Other implementations have been known to use GeneralString to contain
   a UTF-8 encoding. This also violates the ASN.1 standard, since UTF-8
   is a different encoding, not a 94 or 96 character "G" set as defined
   by ISO 2022.  It is believed that these implementations do not even
   use the ISO 2022 escape sequence to change the character encoding.
   Even if implementations were to announce the change of encoding by
   using that escape sequence, the ASN.1 standard prohibits the use of
   any escape sequences other than those used to designate/invoke "G" or
   "C" sets allowed by GeneralString.
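
   The high-bit rule described above can be checked mechanically.  The
   following is a minimal Python sketch, not taken from any Kerberos
   implementation; the function names are hypothetical and chosen only
   for illustration.  It tests the raw octets of a received GeneralString
   for content outside IA5String and for US-ASCII control characters.

```python
def is_ia5_only(octets: bytes) -> bool:
    """Return True if every octet is 7-bit, i.e. legal in an IA5String."""
    return all(octet < 0x80 for octet in octets)


def has_control_characters(octets: bytes) -> bool:
    """Return True if any US-ASCII control character (C0 set or DEL) occurs."""
    return any(octet < 0x20 or octet == 0x7F for octet in octets)
```

   A receiver could use the first check to decide whether a value is safe
   to treat as IA5String, and the second to reject principal or realm
   names that contain control characters.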

   In [@EXTENSIONS] the type KerberosString, defined below, is extended to
   support internationalized strings:

        -- used for names and for error messages
        KerberosString ::= CHOICE {
            ia5  GeneralString (IA5String),
            utf8 UTF8String,  -- normalized
            ... -- no extension may be sent
                -- to an rfc1510 implementation --
        }

   Note that applying a new constraint to a previously unconstrained
   type constitutes creation of a new ASN.1 type. In this particular
   case, the change does not result in a changed encoding under DER.
   This allows for a redefinition of the KerberosString type as a
   migration strategy provided that appropriate constraints are applied
   to restrict the KerberosString to either IA5String or UTF8String
   as appropriate to protocol messages defined in [@CLARIFICATIONS]
   or newly defined in this protocol.

NOTE FOR ASN.1: The comment that utf8 is normalized should be removed.
KerberosStrings used for display are not normalized although they must
be validated for shortest form utf8.
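
The shortest-form check mentioned in the note can be sketched in Python
as follows.  This is an illustration only: it assumes that the language's
strict UTF-8 decoder rejects overlong (non-shortest-form) sequences, which
Python 3's decoder does, and the function name is hypothetical.

```python
def is_shortest_form_utf8(octets: bytes) -> bool:
    """Return True if octets decode as strict, shortest-form UTF-8."""
    try:
        # The strict decoder rejects overlong encodings such as C0 AF.
        octets.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```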

Normalization Requirements of I18N KerberosStrings

   KerberosStrings are used within the Kerberos protocol to represent Realms,
   PrincipalNames, Salts, and Error Text.  These strings can be used in one
   of three ways: as part of a query/response, as input to a storage device, or
   as output intended for display purposes only.  Different string utilizations
   have different requirements for string preparation.  

   The requirements for string preparation are described in "Preparation of
   Internationalized Strings (stringprep)" [RFC3454].  In brief, a string
   preparation process attempts to ensure that two strings entered at
   different times via different user interfaces which are intended
   to be identical will be recognized as such.  This requirement is especially
   true of a complex character-set such as Unicode which allows for multiple
   sequences which are logically identical.

   The following String Preparation profile will be used by the Kerberos
   protocol for preparing Kerberos strings:

     * SASLprep [@SASLPREP]


    KerberosQueryString := KerberosString (UTF8: sender prepared)

    Kerberos Query strings are prepared with SASLprep on the sender and are
    not otherwise modified by the receiver.  This permits a Kerberos sender
    utilizing a newer StringPrep profile than the receiver to transmit query
    strings containing code points unassigned in the version of SASLprep
    utilized by the receiver.  The result of the use of such a string when
    querying against a Kerberos database will simply be a failure to match.
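
    As an illustration of sender-side preparation, the following Python
    sketch follows the SASLprep steps (map, normalize, prohibit, bidi
    check) using the standard library's stringprep tables, which are
    based on Unicode 3.2.  It is a sketch under those assumptions rather
    than a reference implementation; the function name and the
    allow_unassigned parameter are hypothetical.  Unassigned code points
    are allowed by default, as described above for query strings.

```python
import stringprep
import unicodedata

_UCD = unicodedata.ucd_3_2_0  # the stringprep tables are defined on Unicode 3.2
_PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c21, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4, stringprep.in_table_c5,
    stringprep.in_table_c6, stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def saslprep(text: str, allow_unassigned: bool = True) -> str:
    """Prepare text along the lines of SASLprep; raise ValueError on failure."""
    # Map: drop "mapped to nothing" characters, turn non-ASCII spaces into SPACE.
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue
        mapped.append(" " if stringprep.in_table_c12(ch) else ch)
    # Normalize: Unicode normalization form KC.
    result = _UCD.normalize("NFKC", "".join(mapped))
    # Prohibit: reject prohibited and, if requested, unassigned code points.
    for ch in result:
        if any(check(ch) for check in _PROHIBITED):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
        if not allow_unassigned and stringprep.in_table_a1(ch):
            raise ValueError(f"unassigned code point U+{ord(ch):04X}")
    # Bidi: a string with right-to-left characters must not contain
    # left-to-right characters and must start and end with right-to-left ones.
    if any(stringprep.in_table_d1(ch) for ch in result):
        if any(stringprep.in_table_d2(ch) for ch in result):
            raise ValueError("mixed right-to-left and left-to-right text")
        if not (stringprep.in_table_d1(result[0])
                and stringprep.in_table_d1(result[-1])):
            raise ValueError("right-to-left text must start and end with RandALCat")
    return result
```

    With allow_unassigned left at True, a code point unknown to the
    receiver's tables passes through unchanged and simply fails to match
    during the query.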

    KerberosDisplayString := KerberosString

    A Display string is used neither for querying against a database nor for
    storing into a database.  As such there are no canonicalization
    requirements on display strings, and string preparation is not performed
    on strings sent for the purpose of display to the end user.  In general,
    Display Strings are only used to represent Error Text within the Kerberos
    protocol.

    (?) Perhaps we should recommend that KerberosDisplay strings when being
    logged to a file or other storage medium SHOULD be SASLprep'd prior to


    KerberosStorageString := KerberosString (UTF8: sender and receiver prepared)

    Kerberos Storage Strings are prepared with SASLprep by the sender.  As this
    string is meant to be stored for long-term use, there is a requirement that the
    string be recognizable to both the sender and the receiver.  

    The sender must prepare the string to verify that the string contains no
    unassigned code points according to its version of the string prep
    algorithm.  If an unassigned code point is detected an error should be
    returned.  The sender then transmits the unprepared string to the receiver.  

    The receiver of the string must prepare the string and ensure that the
    received string contains no unassigned code points in its version of the
    string prep algorithm.  If the receiver detects unassigned code points in
    the string, an error message, KRB_ERR_INCOMPATIBLE_STRINGPREP, is returned to
    the sender.  The error text should include "SASLprep" and the SASLprep
    version number specified in the IANA StringPrep registry.  Perhaps
    including the unassigned code points. (?)
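
    The receiver-side check can be sketched in Python as follows.  This
    is an illustration only.  It uses the Unicode 3.2 "unassigned" table
    from the standard library's stringprep module as the receiver's
    version of the tables.  The function names and the layout of the
    error text are hypothetical, and the registry version number is left
    out because this text does not state it.

```python
import stringprep

# Error name taken from the text above; no numeric code is assumed here.
KRB_ERR_INCOMPATIBLE_STRINGPREP = "KRB_ERR_INCOMPATIBLE_STRINGPREP"


def unassigned_code_points(text: str) -> list:
    """Return the characters of text that are unassigned in Unicode 3.2."""
    return [ch for ch in text if stringprep.in_table_a1(ch)]


def check_storage_string(text: str):
    """Return None if text is acceptable, else an (error, error text) pair."""
    unassigned = unassigned_code_points(text)
    if not unassigned:
        return None
    listing = ", ".join(f"U+{ord(ch):04X}" for ch in unassigned)
    return (KRB_ERR_INCOMPATIBLE_STRINGPREP,
            f"SASLprep: unassigned code points {listing}")
```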

    NOTE: this mechanism provides the KDC


    KerberosUnpreparedString := KerberosString (UTF8: receiver prepared)

    Kerberos Unprepared Strings are sent unprepared and must be prepared
    by the receiver.  This string type is similar to a Query String except
    that the source of the input string is a protocol which does not
    implement SASLprep normalization but which is capable of transmitting
    internationalized UTF-8 strings.


    KerberosPasswordString := KerberosQueryString

    Passwords are not transmitted within the Kerberos protocol.  However,  
    they are used as input to cryptographic operations which require
    unique and identical representations on both the Kerberos client and
    the KDC.  Kerberos Passwords are therefore treated as Kerberos Query
    Strings on the client requiring preparation via SASLprep prior to

    (?) Representation of password strings in a Kerberos Database are
    left up to implementors.  However, prior to using a password string
    as input to a cryptographic operation defined by [@KCRYPTO] the
    string must be prepared via SASLprep.


    KerberosSaltString := KerberosQueryString

    Salts used in conjunction with the Kerberos protocol must be prepared
    at the time they are inserted into the Kerberos database.

    PrincipalQueryString := KerberosQueryString


    PrincipalStorageString := KerberosStorageString

    (?) This string type is not utilized within [@EXTENSIONS] but is provided
    for reference by protocols which rely on [@EXTENSIONS]


    RealmQueryString := KerberosQueryString


    RealmStorageString := KerberosStorageString

    (?) This string type is not utilized within [@EXTENSIONS] but is provided
    for reference by protocols which rely on [@EXTENSIONS]

(?) All references within the ASN.1 to Principal, Realm, Salt, or ErrorText
are to be replaced with references to PrincipalQueryString, RealmQueryString,
KerberosSaltString, or KerberosDisplayString as appropriate.

Migration Strategy for KerberosString utilization between [@CLARIFICATIONS]
and [@EXTENSIONS]

Kerberos protocol messages compatible with [@CLARIFICATIONS] are constrained to
utilizing KerberosString derived types in their GeneralString (IA5String)
form.  Kerberos protocol messages newly defined in [@EXTENSIONS], this
document, are constrained to utilizing KerberosString derived types in their
UTF8String form.

Interoperability between mixed environments of [@CLARIFICATIONS] and
[@EXTENSIONS] is ensured only when the principal and realm names involved in
the protocol operations are representable without modification in the
GeneralString (IA5String).

If a [@EXTENSIONS] KDC needs to send a KRB_AS_REP to an old client, and the
salt cannot be represented, KDC_ERR_ETYPE_NOSUPP is returned.
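
The downgrade rule for salts can be sketched in Python as follows.  This
is an illustration only: the function name and the (value, error) return
convention are hypothetical, and "representable without modification" is
taken to mean that every character is in the 7-bit US-ASCII range.

```python
from typing import Optional, Tuple

# Error name taken from the text above; no numeric code is assumed here.
KDC_ERR_ETYPE_NOSUPP = "KDC_ERR_ETYPE_NOSUPP"


def downgrade_salt(salt: str) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (octets, None) if salt fits IA5String, else (None, error name)."""
    try:
        return salt.encode("ascii"), None
    except UnicodeEncodeError:
        # The salt cannot be sent to an old client without modification.
        return None, KDC_ERR_ETYPE_NOSUPP
```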

[@CLARIFICATIONS] Service and [@EXTENSIONS] Client:  the client may have a
principal which cannot be represented in a downgrade to the old service.  In
this case a [new] error KDC_ERR_SERVICE_TOO_OLD will be sent.

When performing cross realm operations between a mixture of [@CLARIFICATIONS]
and [@EXTENSIONS], there may be a need to downgrade the contents of the
transited realm field to [@CLARIFICATIONS] compatible message forms.  During a
downgrade operation, if the contents of the transited realm field cannot be
represented in a GeneralString (IA5String), then a [new] error

Note For Implementors who used GeneralString as OctetString (Just Send Eight):

Implementations of [@EXTENSIONS] which receive messages from [@CLARIFICATIONS]
peers which treat GeneralString as OctetString must perform string preparation
on the received query strings derived from KerberosString.  First, the string
is converted from the locale specific character-set in use to Unicode.  The
Unicode string is then prepared with SASLprep before further use.

{??? how is the [@EXTENSIONS] implementation supposed to know how to apply
 NamePrep to domain name labels ???}

If there are any characters which cannot be converted from the locale-specific
character-set to Unicode, or if string preparation fails, an error must be sent.
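
The two-step conversion can be sketched in Python as follows.  This is an
illustration only.  The function names are hypothetical, and the caller
is assumed to know the peer's character set, which the protocol itself
does not convey.  The default prepare argument performs only NFKC
normalization and is a placeholder, not SASLprep; a real implementation
would pass a full SASLprep function instead.

```python
import unicodedata
from typing import Callable


def _placeholder_prepare(text: str) -> str:
    # Placeholder only: NFKC is one step of SASLprep, not the whole profile.
    return unicodedata.normalize("NFKC", text)


def import_legacy_string(
    octets: bytes,
    charset: str,
    prepare: Callable[[str], str] = _placeholder_prepare,
) -> str:
    """Convert "Just Send Eight" octets to Unicode, then prepare the result."""
    try:
        # Step 1: locale-specific character set to Unicode, strictly.
        text = octets.decode(charset, errors="strict")
    except (UnicodeDecodeError, LookupError) as exc:
        raise ValueError(f"cannot convert from {charset!r} to Unicode") from exc
    # Step 2: string preparation; a failure here must also produce an error.
    return prepare(text)
```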

Best Common Practice Recommendations for the Processing of Internationalized
Domain-Style Realm Names:

Domain Style Realm names are defined as all realm names whose components are
separated by Full Stop (0x002E) (aka periods, '.') and contain neither colons,
':', nor forward slashes, '/'.  When establishing a new domain-style realm
name containing one or more internationalized characters (not included in
US-ASCII), the following procedure must be used:

 * the realm name must be a valid domain name as per the rules of IDNA [RFC3490]

 * the following string preparation routine must be followed:

   - separate the string into components separated by any of the Full
     Stop characters
   - fold all Full Stop characters to Full Stop (0x002E)

   - for each component (perform all steps):

      . if the component begins with an ACE prefix as registered with IANA,
        the prefix will be removed and the rest of the component will be
        converted from the ACE representation to Unicode [need reference]
      . if the component contains one or more internationalized characters,
        separately apply the NamePrep and SASLprep string preparation methods.
      . if the output of the two methods match, continue on to the next
        component; otherwise reject the realm name as invalid

   - the result of a valid realm name is the joining of the individual
     string prepared components separated by the Full Stop (0x002E)
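
The string preparation routine above can be sketched in Python as
follows.  This is a sketch under stated assumptions, not a definitive
implementation.  The _prepare helper is a simplified stand-in for both
profiles that omits their bidirectional checks.  The set of Full Stop
characters and the "xn--" ACE prefix are taken from IDNA.  The first
requirement, that the realm be a valid domain name under IDNA, is not
checked.  All function names are hypothetical.

```python
import stringprep
import unicodedata

_UCD = unicodedata.ucd_3_2_0        # the stringprep tables use Unicode 3.2
OTHER_FULL_STOPS = "\u3002\uff0e\uff61"  # folded to U+002E, as in IDNA
ACE_PREFIX = "xn--"                 # ACE prefix used by IDNA
_PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c22, stringprep.in_table_c3,
    stringprep.in_table_c4, stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8, stringprep.in_table_c9,
)


def _prepare(text: str, fold_case: bool) -> str:
    """Stand-in: fold_case=True approximates NamePrep, False SASLprep."""
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue  # mapped to nothing in both profiles
        if fold_case:
            mapped.append(stringprep.map_table_b2(ch))  # NamePrep case folding
        else:
            mapped.append(" " if stringprep.in_table_c12(ch) else ch)
    result = _UCD.normalize("NFKC", "".join(mapped))
    for ch in result:
        if any(check(ch) for check in _PROHIBITED):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
        if not fold_case and stringprep.in_table_c21(ch):
            raise ValueError("ASCII control character")  # SASLprep only
    return result


def prepare_realm(realm: str) -> str:
    """Apply the per-component routine; raise ValueError for an invalid realm."""
    for stop in OTHER_FULL_STOPS:
        realm = realm.replace(stop, ".")
    components = []
    for component in realm.split("."):
        if component.lower().startswith(ACE_PREFIX):
            # Remove the ACE prefix and convert the remainder to Unicode.
            ace = component[len(ACE_PREFIX):]
            component = ace.encode("ascii").decode("punycode")
        if any(ord(ch) > 0x7F for ch in component):
            prepared = _prepare(component, fold_case=False)
            if _prepare(component, fold_case=True) != prepared:
                raise ValueError(f"profiles disagree on {component!r}")
            component = prepared
        components.append(component)
    return ".".join(components)
```

Because NamePrep folds case and SASLprep does not, the comparison
rejects any internationalized component that is not already in folded
(lower-case) form.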

In [@CLARIFICATIONS], the recommendation is that all domain style realm
names be represented in uppercase.  This recommendation is modified in the
following manner.  All components of domain style realm names not including
internationalized characters should be upper-cased.  All components of domain  
style realm names including internationalized characters must be
lower-cased.  (The lower case representation of internationalized components
is enforced by the requirement that the output of NamePrep and SASLprep
string preparation must be equivalent.)

Best Common Practice Recommendations for the Processing of Principal Names
Consisting of Internationalized Domain Names:

Kerberos principals are often created for the purpose of authenticating a
service located on a machine identified by a domain name.  Unfortunately,
once a principal name is created it is impossible to know the source from
which the resulting KerberosString was derived.  It is therefore required
that principal names containing internationalized domain names be processed
via the following procedure:

 * ensure that the IDN component is a valid domain name as per the
   rules of IDNA [RFC3490]

 * separate the IDN component into labels separated by any of the Full Stop
   characters
 * fold all Full Stop characters to Full Stop (0x002E)

 * for each label (perform all steps):

      . if the label begins with an ACE prefix as registered with IANA,
        the prefix will be removed and the rest of the label will be
        converted from the ACE representation to Unicode [need reference]
      . if the label contains one or more internationalized characters,
        separately apply the NamePrep and then the SASLprep string
        preparation methods.
      . if the label contains no internationalized characters, the
        label is to be lower-cased
      . if the output of the two methods match, continue on to the next
        label; otherwise reject the principal name as invalid

 * the result of a valid principal name component derived from an IDN is the
   joining of the individual string prepared labels separated by the
   Full Stop (0x002E)

Negative Authorization Database - Security Considerations

Proposal: KDC Should Perform Normalization on Query Strings and perform
necessary operations

Internationalization Considerations Section: (should be mandatory)
  * When displaying i18n principal names the bidi rules must be applied
    (UAX 9 - not perfect) [see IRI recommended text]

TO DO: Verify matching semantics of "Unicode Variation Selectors" which for
the most part are mapped to nothing in SASLPrep.
