ASN encoding
Jeffrey Altman
jaltman2 at nyc.rr.com
Fri Oct 31 21:59:05 EST 2003
The only valid characters that may be used in RFC 1510 implementations
of Kerberos within GeneralString fields are those contained in US-ASCII.
The following text is quoted from:
draft-ietf-krb-wg-kerberos-clarifications-04.txt
5.2.1. KerberosString
The original specification of the Kerberos protocol in RFC 1510 uses
GeneralString in numerous places for human-readable string data.
Historical implementations of Kerberos cannot utilize the full power
of GeneralString. This ASN.1 type requires the use of designation
and invocation escape sequences as specified in ISO-2022/ECMA-35
[ISO-2022/ECMA-35] to switch character sets, and the default
character set that is designated as G0 is the ISO-646/ECMA-6
[ISO-646,ECMA-6] International Reference Version (IRV) (aka U.S.
ASCII), which mostly works.
ISO-2022/ECMA-35 defines four character-set code elements (G0..G3)
and two Control-function code elements (C0..C1). DER prohibits the
designation of character sets as any but the G0 and C0 sets.
Unfortunately, this seems to have the side effect of prohibiting the
use of ISO-8859 (ISO Latin) [ISO-8859] character-sets or any other
character-sets that utilize a 96-character set, since it is
prohibited by ISO-2022/ECMA-35 to designate them as the G0 code
element. This side effect is being investigated in the ASN.1
standards community.
In practice, many implementations treat GeneralStrings as if they
were 8-bit strings of whichever character set the implementation
defaults to, without regard for correct usage of character-set
designation escape sequences. The default character set is often
determined by the current user's operating system dependent locale.
At least one major implementation places unescaped UTF-8 encoded
Unicode characters in the GeneralString. This failure to adhere to
the GeneralString specifications results in interoperability issues
when conflicting character encodings are utilized by the Kerberos
clients, services, and KDC.
This unfortunate situation is the result of improper documentation of
the restrictions of the ASN.1 GeneralString type in prior Kerberos
specifications.
The new (post-RFC 1510) type KerberosString, defined below, is a
GeneralString that is constrained to only contain characters in
IA5String.

    KerberosString ::= GeneralString (IA5String)

US-ASCII control characters should in general not be used in
KerberosString, except for cases such as newlines in lengthy error
messages. Control characters SHOULD NOT be used in principal names or
realm names.
For compatibility, implementations MAY choose to accept GeneralString
values that contain characters other than those permitted by
IA5String, but they should be aware that character set designation
codes will likely be absent, and that the encoding should probably be
treated as locale-specific in almost every way. Implementations MAY
also choose to emit GeneralString values that are beyond those
permitted by IA5String, but should be aware that doing so is
extraordinarily risky from an interoperability perspective.
Some existing implementations use GeneralString to encode unescaped
locale-specific characters. This is a violation of the ASN.1
standard. Most of these implementations encode US-ASCII in the left-hand
half, so as long as the implementation transmits only US-ASCII, the
ASN.1 standard is not violated in this regard. As soon as such an
implementation encodes unescaped locale-specific characters with the
high bit set, it violates the ASN.1 standard.
Other implementations have been known to use GeneralString to contain
a UTF-8 encoding. This also violates the ASN.1 standard, since UTF-8
is a different encoding, not a 94 or 96 character "G" set as defined
by ISO 2022. It is believed that these implementations do not even
use the ISO 2022 escape sequence to change the character encoding.
Even if implementations were to announce the change of encoding by
using that escape sequence, the ASN.1 standard prohibits the use of
any escape sequences other than those used to designate/invoke "G" or
"C" sets allowed by GeneralString.
Future revisions to this protocol will almost certainly allow for a
more interoperable representation of principal names, probably
including UTF8String.
Note that applying a new constraint to a previously unconstrained
type constitutes creation of a new ASN.1 type. In this particular
case, the change does not result in a changed encoding under DER.
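
To make the constraint above concrete, here is a minimal sketch in Python
of what "GeneralString restricted to IA5String" means on the wire. It is
not taken from MIT, Heimdal, or any other implementation; the function
names are hypothetical and chosen only for illustration. It assumes the
string is already available as raw octets, treats any octet with the high
bit set as outside IA5String, and encodes with the GeneralString universal
tag (27, 0x1B), since the constraint does not change the DER encoding.

```python
GENERAL_STRING_TAG = 0x1B  # ASN.1 universal tag 27 (GeneralString)


def is_ia5_only(value: bytes) -> bool:
    """Return True if every octet is 7-bit, i.e. within IA5String/US-ASCII."""
    return all(octet < 0x80 for octet in value)


def has_control_chars(value: bytes) -> bool:
    """Return True if the value contains a US-ASCII control character.

    The draft says these SHOULD NOT appear in principal or realm names.
    """
    return any(octet < 0x20 or octet == 0x7F for octet in value)


def der_length(n: int) -> bytes:
    """Encode a DER definite length (short form below 128, else long form)."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_kerberos_string(value: bytes) -> bytes:
    """DER-encode value as a GeneralString, refusing non-IA5 octets.

    Hypothetical helper: a strict sender that never emits octets with the
    high bit set, as recommended for interoperability.
    """
    if not is_ia5_only(value):
        raise ValueError("octets outside IA5String are not interoperable")
    return bytes([GENERAL_STRING_TAG]) + der_length(len(value)) + value
```

For example, encode_kerberos_string(b"krbtgt") yields the octets
1B 06 6B 72 62 74 67 74, while a UTF-8 encoded name containing a non-ASCII
character is rejected instead of being sent unescaped. A receiver that
chooses to accept such values for compatibility would skip the check and
treat the octets as locale-specific, with the risks described above.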
Gustavo Rios wrote:
> Dear gentleman/madam,
>
> I am studying the Kerberos V (RFC 1510) protocol specification. Some data
> types used for communication are specified with GeneralString encoding. I
> then started studying ASN.1. It came as a surprise to me that not only the
> documentation sources but also the ITU standard itself advise against
> the use of GeneralString. I therefore respectfully request your
> clarification on this.
>
> How do current Kerberos implementations deal with this? I mean, how
> is the encoding currently performed? What are the valid characters used
> by the implementations currently on the market? (I am considering the MIT
> and Heimdal implementations at least; if you know of any others, let
> me know.)
>
> Thanks a lot for your time.
>
> PS: My sources of information are:
>
> http://asn1.elibel.tm.fr/en/book/index.htm
> http://asn1.elibel.tm.fr/en/standards/index.htm