Message-ID: <bb8a2add-2b6e-c35c-ff5b-a7816eeb7e26@oracle.com>
Date:   Wed, 19 Oct 2022 18:36:07 +0000
From:   Jane Chu <jane.chu@...cle.com>
To:     Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
CC:     Petr Mladek <pmladek@...e.com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "senozhatsky@...omium.org" <senozhatsky@...omium.org>,
        "linux@...musvillemoes.dk" <linux@...musvillemoes.dk>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Haakon Bugge <haakon.bugge@...cle.com>,
        John Haxby <john.haxby@...cle.com>,
        Jane Chu <jane.chu@...cle.com>
Subject: Re: [PATCH] vsprintf: protect kernel from panic due to non-canonical
 pointer dereference

On 10/18/2022 1:49 PM, Andy Shevchenko wrote:
> On Tue, Oct 18, 2022 at 08:30:01PM +0000, Jane Chu wrote:
>> On 10/18/2022 1:07 PM, Andy Shevchenko wrote:
>>> On Tue, Oct 18, 2022 at 06:56:31PM +0000, Jane Chu wrote:
>>>> On 10/18/2022 5:45 AM, Petr Mladek wrote:
>>>>> On Mon 2022-10-17 19:31:53, Jane Chu wrote:
>>>>>> On 10/17/2022 12:25 PM, Andy Shevchenko wrote:
>>>>>>> On Mon, Oct 17, 2022 at 01:16:11PM -0600, Jane Chu wrote:
>>>>>>>> While debugging a separate issue, it was found that an invalid string
>>>>>>>> pointer could very well contain a non-canonical address, such as
>>>>>>>> 0x7665645f63616465. In that case, this line of defense isn't enough
>>>>>>>> to protect the kernel from crashing due to general protection fault
>>>>>>>>
>>>>>>>> 	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>>>>>>>>                     return "(efault)";
>>>>>>>>
>>>>>>>> So instead, use kern_addr_valid() to validate the string pointer.
>>>>>>>
>>>>>>> How did you check that value of the (invalid string) pointer?
>>>>>>>
>>>>>>
>>>>>> In the bug scenario, the invalid string pointer was an out-of-bound
>>>>>> string pointer. While the OOB referencing is fixed,
>>>>>
>>>>> Could you please provide more details about the fixed OOB?
>>>>> What exact vsprintf()/printk() call was broken and eventually
>>>>> how it was fixed, please?
>>>>
>>>> For sensitivity reasons, I'd like to avoid mentioning the specific name
>>>> of the sysfs attribute in the bug and instead just call it
>>>> "devX_attrY[]" and describe the precise nature of the issue.
>>>>
>>>> devX_attrY[] is a string array, declared and filled at compile time,
>>>> like
>>>>      const char * const devX_attrY[] = {
>>>> 	[ATTRY_A] = "Dev X AttributeY A",
>>>> 	[ATTRY_B] = "Dev X AttributeY B",
>>>> 	...
>>>> 	[ATTRY_G] = "Dev X AttributeY G",
>>>>      };
>>>> such that, when a user runs "cat /sys/devices/systems/.../attry_1",
>>>> "Dev X AttributeY B" will show up in the terminal.
>>>> That's it, no more reference to the pointer devX_attrY[ATTRY_B] after that.
>>>>
>>>> The bug was that the index into the array was computed incorrectly,
>>>> leading to an OOB access, e.g. devX_attrY[11].  The fix was to correct
>>>> the calculation, and that is not an upstream fix.
>>>>
>>>>>
>>>>>> the lingering issue
>>>>>> is that the kernel ought to be able to protect itself, as the pointer
>>>>>> contains a non-canonical address.
>>>>>
>>>>> Was the pointer used only by the vsprintf()?
>>>>> Or was it accessed also by another code, please?
>>>>
>>>> The OOB pointer was used only by vsprintf() for the "cat" sysfs case.
>>>> No other code uses the OOB pointer, verified both by code examination
>>>> and test.
>>>
>>> So, then the vsprintf() is _the_ point to crash and why should we hide that?
>>> Because of the crash you found the culprit, right? The efault will hide very
>>> important details.
>>>
>>> So to me it sounds like I like this change less and less...
>>
>> What about the existing check
>>    	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>>                       return "(efault)";
>> ?
> 
> Because it's _special_. We know that the first page is equivalent to a NULL
> pointer and the last one is dedicated to so-called error pointers. There are
> no more special exceptions to the addresses in the Linux kernel (I'm not
> talking about the alignment requirements of certain architectures).
> 
>> In an experiment just to print the raw OOB pointer values, I saw the
>> output below (the devX attrY names are substitutes for the real
>> attributes; other values and strings are copied verbatim from "dmesg"):
>>
>> [ 3002.772329] devX_attrY[26]: (ffffffff84d60ad3) Dev X AttributeY E
>> [ 3002.772346] devX_attrY[27]: (ffffffff84d60ae4) Dev X AttributeY F
>> [ 3002.772347] devX_attrY[28]: (ffffffff84d60aee) Dev X AttributeY G
>> [ 3002.772349] devX_attrY[29]: (0) (null)
>> [ 3002.772350] devX_attrY[30]: (0) (null)
>> [ 3002.772351] devX_attrY[31]: (0) (null)
>> [ 3002.772352] devX_attrY[32]: (7665645f63616465) (einval)
>> [ 3002.772354] devX_attrY[33]: (646e61685f656369) (einval)
>> [ 3002.772355] devX_attrY[34]: (6f635f65755f656c) (einval)
>> [ 3002.772355] devX_attrY[35]: (746e75) (einval)
>>
>> where starting from index 29 are all OOB pointers.
>>
>> As you can see, if the OOBs are NULL, "(null)" is printed due to the
>> existing check. When the OOBs are non-canonical, they are detectable
>> as well: the fact that the pointer value deviates from
>>     (ffffffff84d60aee + 4 * sizeof(void *))
>> evidently shows that the OOBs are detectable.
>>
>> The question then is why should the non-canonical OOBs be treated
>> differently from NULL and ERR_VALUE?
> 
> Obviously, to see the crash. And to let the kernel _crash_. Isn't that what
> we need in order to see a bug as early as possible?
> 

If the purpose is to see the bug as early as possible, then getting
"(efault)" from reading the sysfs attribute would serve the purpose, right?

The fact that an OOB pointer has already been turned into either NULL or
a non-canonical value implies that *if* kernel code other than
vsprintf() references the pointer, it'll crash elsewhere; but *if* no
other code references the pointer, why crash?
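
To make the distinction discussed above concrete, here is a minimal
user-space sketch of the two kinds of check. It is not the kernel's
code: is_canonical() and in_special_range() are hypothetical helper
names, and the 4096-byte page size, the MAX_ERRNO value of 4095, and the
48-bit virtual address width (x86-64 with 4-level paging) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */
#define SKETCH_MAX_ERRNO 4095ULL	/* assumed, modelled on the kernel's MAX_ERRNO */

/*
 * Hypothetical helper: true if addr is canonical for a virtual address
 * width of va_bits, i.e. bits 63..(va_bits - 1) are all zero or all one.
 */
static bool is_canonical(uint64_t addr, unsigned int va_bits)
{
	assert(va_bits >= 1 && va_bits <= 64);

	uint64_t top = addr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return top == 0 || top == all_ones;
}

/*
 * Hypothetical helper modelled on the existing check quoted above:
 * true for the first page (NULL-like) and for the error-pointer range
 * at the very top of the address space.
 */
static bool in_special_range(uint64_t addr)
{
	return addr < SKETCH_PAGE_SIZE || addr >= (uint64_t)-SKETCH_MAX_ERRNO;
}
```

With these assumptions, is_canonical(0xffffffff84d60aee, 48) is true,
while is_canonical(0x7665645f63616465, 48) is false, and neither value
falls in the range covered by in_special_range(). Note that 0x746e75
from the log is canonical in this sense, so a test based on canonicality
alone would not flag it.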

thanks,
-jane
