non-ascii password in kerberos authentication

Wed Oct 31 01:40:35 EDT 2007

> -----Original Message-----
> From: Jeffrey Altman [mailto:jaltman at secure-endpoints.com]
> Sent: Wednesday, October 31, 2007 12:22 PM
> To: Xu Qiang
> Cc: Ken Raeburn; krbdev at mit.edu; Su Huang (FXSGSC) Yi
> Subject: Re: non-ascii password in kerberos authentication
> 
> The history of ISO-Latin-9 is that it was created many years after
> ISO-Latin-1 to ISO-Latin-8 had been standardized.  When the Euro was
> created there was an obvious need to add the character, but you can't
> change the standard after it is published and ISO-Latin character sets
> follow the rules of ISO-2022 which prohibits printable characters in
> the C1 control character range.  As a result they had to create the new
> ISO-Latin-9 character set to include the Euro character for Western
> Europe by replacing the generic currency sign (0xA4).  Since they are different
> characters, they need to be different character sets.
> 
> Microsoft's ANSI character sets are closer in heritage to the IBM Code
> Pages.  IBM CP850 was the Western European character set that included
> all of the characters of ISO-Latin-1 plus the box drawing characters and
> many other characters used within Western Europe that could not fit in
> ISO-Latin-1.  The reason these additional characters could fit in the
> IBM Code Pages is that unlike the ISO-2022 based character sets, the IBM
> Code Pages did not reserve the C1 control character range.
> 
> Microsoft's ANSI character sets, like the IBM Code Pages (which Microsoft
> calls OEM Character Sets), do not reserve the C1 control character range
> and therefore there was room to support both the generic currency and Euro
> characters within the ANSI Latin-1 character set.
> 
> For additional historical reference, when the Euro character was
> originally introduced IBM modified CP850 to include it by replacing the
> dotless-i (0xD5) which is used extensively in Turkey.  This produced
> significant backlash which resulted in the introduction of CP858 with
> the Euro character and restoration of the dotless-i to CP850.  However,
> the damage had already been done.  The ISO committees decided against
> repeating IBM's folly.
> 
> Microsoft in its infinite wisdom decided that the OEM Code Pages needed
> replacing and created a matching "ANSI" code page for each of the ISO
> Latin character sets.  The Code Page 1252 is called "ANSI Latin-1" and
> includes all of the characters from ISO Latin-1 at the same code points.
> However, it also adds a number of characters not found in ISO Latin-1
> and there is not a one-to-one mapping with characters in the OEM Code
> Pages.  The original version of Code Page 1252 did not include the Euro
> character.  Nor did it include the "S with caron" or "Z with caron"
> characters.  These have been added over time and there are still five
> unused code points that can be assigned values at some time in the future.

Jeff, thanks a lot for the detailed explanation of how the euro symbol came
into these character sets.

> No it won't.  Your issue is that there is not a one-to-one mapping
> between the code points in the Windows ANSI Latin-1 (CP1252) character
> set and Unicode.  As a result, in order to solve this problem you are
> going to have to implement some way of communicating the character set
> used by your application to the Kerberos library and then replace the
> dumb NUL-stuffing algorithm with one that actually performs a
> character-set translation.
> 
> If you know that your application always uses only a single character
> set, then you could (for your own distribution) bypass the character-set
> communication and simply replace the NUL-stuffing code with
> character-set translation routines for CP1252 to Unicode UCS-2LE.
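
To make the difference concrete, here is a small sketch in Python (not the
Kerberos library's actual code, which is in C). The function names nul_stuff
and cp1252_to_ucs2le are made up for illustration. It assumes the password
bytes really are CP1252, and it relies on every CP1252 character fitting in a
single 16-bit unit, so Python's "utf-16-le" codec gives the same bytes as
UCS-2LE here. Python's cp1252 codec rejects the five unassigned code points
rather than guessing a mapping.

```python
def nul_stuff(password: bytes) -> bytes:
    """Naive widening: follow every input byte with a zero byte."""
    return b"".join(bytes((b, 0)) for b in password)


def cp1252_to_ucs2le(password: bytes) -> bytes:
    """Translate CP1252 bytes to Unicode, then emit little-endian 16-bit units.

    Raises UnicodeDecodeError for bytes that CP1252 leaves unassigned.
    """
    return password.decode("cp1252").encode("utf-16-le")


if __name__ == "__main__":
    ascii_pw = b"secret"
    euro_pw = b"pa\x80ss"  # 0x80 is the Euro sign in CP1252

    # For pure ASCII the two approaches agree.
    print(nul_stuff(ascii_pw) == cp1252_to_ucs2le(ascii_pw))  # True

    # For the Euro sign they differ: 0x80 0x00 versus 0xAC 0x20 (U+20AC).
    print(nul_stuff(euro_pw).hex())         # 70006100800073007300
    print(cp1252_to_ucs2le(euro_pw).hex())  # 70006100ac2073007300
```

The two results only differ for bytes in the 0x80-0x9F range, which is where
CP1252 departs from ISO Latin-1; for anything else NUL-stuffing happens to
give the right answer.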

Yes, I would like to change the NUL-stuffing code a little so that it supports
the euro sign and the sign can be used in an RC4-encrypted password.

But for a non-developer like me, searching the source for the exact location
of the NUL-stuffing code is a headache. I hope the developers can give us
some suggestions on where to find that code.

Thanks a lot,
Xu Qiang