krb5 commit: Modernize UTF-8/UCS-2 conversion code

Tue Apr 25 13:31:25 EDT 2017

On Tue, 2017-04-18 at 10:32 +0200, Stefan Metzmacher wrote:
> On 17.04.2017 at 21:53, Greg Hudson wrote:
> > On 04/17/2017 03:24 PM, Stefan Metzmacher wrote:
> >> Is there a reason why you still limit this to UCS-2 and not use full UTF-16
> >> support?
> > 
> > This commit was just trying to eliminate warnings, not change the
> > functionality.
> 
> Ok, "Modernize UTF-8/UCS-2 conversion code" suggested a more or less
> complete rewrite of the code to me. As it's not, you may just ignore this
> topic for now.
> 
> > This code originally came from OpenLDAP's utf-8-conv.c, and was
> > incorporated with the goal of Windows compatibility (for PACs and RC4
> > string-to-key).  If Windows now supports UTF-16 then we could consider
> > extending it, but I wouldn't want to do so without confirmation of that.
> 
> I think Windows has always supported UTF-16 instead of UCS-2, at least it
> has for the last 10 years (at least for passwords).
> 
> >> We had a lot of trouble with the limited unicode support of krb5 libraries
> >> in Samba recently, see https://bugzilla.samba.org/show_bug.cgi?id=12262
> 
> The core of the problem is that Windows uses a random buffer as the machine
> password and interprets it as UTF-16 (although it is not always valid UTF-16).
> The MD4 of the raw buffer is the NTHASH/RC4-key.

Would it make sense to provide an alternative interface to calculate the
RC4-key that takes a random buffer instead of UTF-8, so we do not have
to deal with awkward conversions or restrictions on how we generate this
buffer?

Simo.

> In order to calculate the AES-keys the buffer is converted to (valid)
> UTF-8 first.
> The following rules are applied to the raw buffer first:
> 1) any 0x0000 characters are mapped to 0x0001
> 2) convert any instance of 0xD800 - 0xDBFF (high surrogate)
>    without an immediately following 0xDC00 - 0xDFFF (low surrogate) to
>    U+FFFD (REPLACEMENT CHARACTER).
> 3) the same for any low surrogate that was not preceded by a high surrogate.
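
The three rules above can be sketched in Python as follows. This is only an illustration, not code from krb5, Samba, or Windows: the function names are made up here, and the buffer is assumed to hold little-endian 16-bit units.

```python
import struct

REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def sanitize_utf16_units(raw):
    """Apply rules 1-3 to a buffer of little-endian 16-bit units (assumed)."""
    if len(raw) % 2:
        raise ValueError("buffer must contain whole 16-bit units")
    units = struct.unpack("<%dH" % (len(raw) // 2), raw)
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        if u == 0x0000:
            out.append(0x0001)               # rule 1: 0x0000 -> 0x0001
        elif 0xD800 <= u <= 0xDBFF:
            nxt = units[i + 1] if i + 1 < len(units) else None
            if nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
                out.extend((u, nxt))         # valid surrogate pair, keep both
                i += 1
            else:
                out.append(REPLACEMENT)      # rule 2: unpaired high surrogate
        elif 0xDC00 <= u <= 0xDFFF:
            out.append(REPLACEMENT)          # rule 3: unpaired low surrogate
        else:
            out.append(u)
        i += 1
    return out


def raw_buffer_to_utf8(raw):
    """Hypothetical helper: sanitize the raw buffer, then convert to UTF-8."""
    units = sanitize_utf16_units(raw)
    cleaned = struct.pack("<%dH" % len(units), *units)
    return cleaned.decode("utf-16-le").encode("utf-8")
```

After sanitizing, the unit sequence is valid UTF-16, so the strict decoder in raw_buffer_to_utf8() should not raise.
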
> 
> Windows stores the raw buffer as master copy of the password,
> while Samba currently stores the UTF-8 password.
> 
> If we pass the UTF-8 to the krb5 libraries we may calculate a different
> NTHASH, because the transition from the random (UTF-16-like) buffer to
> valid UTF-8 and back to valid UTF-16 loses information.
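
A small Python demonstration of that information loss, as a sketch only. It uses Python's "replace" error handler as a stand-in for the substitution rules described earlier, and the two-unit buffer is an invented example, not a real machine password.

```python
# Two little-endian 16-bit units: an unpaired high surrogate (0xD800), then "A".
raw = bytes([0x00, 0xD8, 0x41, 0x00])

# Stand-in for the cleanup step: the unpaired surrogate becomes U+FFFD.
text = raw.decode("utf-16-le", errors="replace")
utf8 = text.encode("utf-8")

# What a library can reconstruct if it is handed only the UTF-8 form.
roundtrip = utf8.decode("utf-8").encode("utf-16-le")

print("raw      :", raw.hex())
print("roundtrip:", roundtrip.hex())
print("identical:", roundtrip == raw)
```

Since the NTHASH is the MD4 of the raw bytes, any byte difference between the original buffer and the round-tripped one yields a different key.
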
> 
> So our current solution is to limit the machine password to valid UTF-8,
> with codepoints up to 0xffff.
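
One way such a restriction could look, as a hypothetical Python sketch. This is not Samba's actual generator: the function name, the default length, and the choice to skip U+0000 and the surrogate range are assumptions made for illustration.

```python
import secrets


def random_bmp_password(length=32):
    """Return `length` random code points in 1..0xFFFF, excluding surrogates."""
    chars = []
    while len(chars) < length:
        cp = secrets.randbelow(0xFFFF) + 1   # uniform in 1..0xFFFF, never 0
        if 0xD800 <= cp <= 0xDFFF:
            continue                          # surrogates are not valid scalars
        chars.append(chr(cp))
    return "".join(chars)


pw = random_bmp_password()
utf16 = pw.encode("utf-16-le")
# Every character is in the BMP, so each one is exactly one 16-bit unit and
# the UTF-8 <-> UTF-16 conversion is lossless in both directions.
assert utf16.decode("utf-16-le").encode("utf-8") == pw.encode("utf-8")
```
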
> 
> In future we may try to pass a keytab instead of a password to the krb5
> libraries and calculate the keys ourselves in order to avoid these problems.
> 
> So there's currently no immediate action required.
> 
> metze

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc