linux-kernel - Re: [PATCH] nls: add surrogate pair support in nls utf8.


Message-ID: <Pine.LNX.4.44L0.1112061102270.1469-100000@iolanthe.rowland.org>
Date:	Tue, 6 Dec 2011 11:10:02 -0500 (EST)
From:	Alan Stern <stern@...land.harvard.edu>
To:	Namjae Jeon <linkinjeon@...il.com>
cc:	akpm@...ux-foundation.org, <linux-kernel@...r.kernel.org>,
	Ashish Sangwan <ashishsangwan2@...il.com>
Subject: Re: [PATCH] nls: add surrogate pair support in nls utf8.

On Tue, 6 Dec 2011, Namjae Jeon wrote:

> > Firstly, have you checked whether the callers of this function expect
> > to receive back more than one 16-bit value?  Maybe you will overrun
> > their buffers by doing this.
> Hi Alan.
> First, thanks for your review.
> Yes, you're right, and yes, I have checked it on the FAT fs. I will try to
> post the FAT patch below after this patch is applied.

But what about other callers?  Have you checked the entire kernel 
source to see if char2uni is used anywhere else?

> --- a/fs/fat/namei_vfat.c
> +++ b/fs/fat/namei_vfat.c
> @@ -555,7 +555,10 @@ xlate_to_uni(const unsigned char *name, int len, unsigned char *outname,
>                                                 return -EINVAL;
>                                         ip += charlen;
>                                         i += charlen;
> -                                       op += 2;
> +                                       if (charlen == sizeof(unicode_t))
> +                                               op += 4;
> +                                       else
> +                                               op += 2;

This seems completely wrong.  The amount you increment the output
pointer doesn't depend on the length of the input character; it depends
on whether the output needed to use a surrogate pair.
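
To illustrate the point, here is a minimal sketch in plain userspace C (not
kernel code) of an output advance computed from the decoded code point
rather than from the input byte length. The helper names utf16_units_for()
and utf16_bytes_for() are hypothetical and do not exist in nls_base.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many 16-bit units a code point occupies in
 * UTF-16. Code points above the Basic Multilingual Plane (>= 0x10000)
 * need a surrogate pair; everything else fits in one unit.
 */
static size_t utf16_units_for(uint32_t cp)
{
	return cp >= 0x10000 ? 2 : 1;
}

/* Byte advance for an output pointer that is stepped in bytes. */
static size_t utf16_bytes_for(uint32_t cp)
{
	return utf16_units_for(cp) * sizeof(uint16_t);
}
```

With something like this, the caller would write "op += utf16_bytes_for(cp)"
after decoding cp, so the advance follows what was actually emitted.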

Furthermore, it doesn't answer my question.  What happens if _every_
character in the input filename has to be converted to a surrogate 
pair?  Then the output string will be twice as long as xlate_to_uni() 
expects, so it might overrun the output buffer.
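
A rough userspace sketch of the kind of bounds check this calls for is shown
here. It assumes the caller knows the output capacity in 16-bit units; the
function name utf32s_to_utf16s_bounded() and its -1 error convention are
made up for illustration and are not an existing kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode in_len code points as UTF-16 into out,
 * never writing more than out_cap 16-bit units.
 * Returns the number of units written, or -1 if a code point is invalid
 * or the output would not fit.
 */
static long utf32s_to_utf16s_bounded(const uint32_t *in, size_t in_len,
				     uint16_t *out, size_t out_cap)
{
	size_t n = 0;

	for (size_t i = 0; i < in_len; i++) {
		uint32_t cp = in[i];

		/* Reject values outside Unicode and lone surrogates. */
		if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return -1;

		if (cp < 0x10000) {
			if (n + 1 > out_cap)
				return -1;
			out[n++] = (uint16_t)cp;
		} else {
			/* A surrogate pair takes two units; check both. */
			if (n + 2 > out_cap)
				return -1;
			cp -= 0x10000;
			out[n++] = (uint16_t)(0xD800 | (cp >> 10));
			out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
		}
	}
	return (long)n;
}
```

In the worst case every input character lands in the surrogate-pair branch,
so a buffer sized for one unit per character is only half as large as needed.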


> > Secondly, you shouldn't have to make all these changes.  Just call
> > utf8s_to_utf16s(); then all you have to worry about is changing an
> > invalid character to a '?'.
> Currently there are two paths, depending on the mount option. With the
> -o utf8 mount option, xlate_to_uni() in FAT uses utf8s_to_utf16s(),
> and with iocharset=utf8, it uses char2uni.
> nls->char2uni works on a single character while utf8s_to_utf16s works
> on a whole string.
> char2uni is needed when a mount option (uni_xlate) that requires
> scanning character by character is used.

That doesn't matter.  The point is that you have duplicated the code in 
nls_base.c, which is a bad idea.  Instead of making a copy of the code, 
you should use the code that is already there.  You may find that the 
easiest way is to add a new utf32_to_utf16 function in nls_base.c.

Alan Stern
