Message-ID: <74b48c13-712b-40b6-be1c-a79803aee07f@redhat.com>
Date: Fri, 6 Sep 2024 19:08:12 +0800
From: Xiubo Li <xiubli@...hat.com>
To: Luis Henriques <luis.henriques@...ux.dev>
Cc: Ilya Dryomov <idryomov@...il.com>, ceph-devel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] ceph: fix out-of-bound array access when doing a file
read
Hi Luis,
Sorry for the late reply.
On 8/28/24 23:48, Luis Henriques wrote:
> On Wed, Aug 28 2024, Xiubo Li wrote:
>
>> On 8/27/24 21:36, Luis Henriques wrote:
>>> On Thu, Aug 22 2024, Luis Henriques (SUSE) wrote:
>>>
>>>> If, while doing a read, the inode is updated and the size is set to zero,
>>>> __ceph_sync_read() may not be able to handle it. It is thus easy to hit a
>>>> NULL pointer dereference by continuously reading a file while, on another
>>>> client, we keep truncating and writing new data into it.
>>>>
>>>> This patch fixes the issue by adding extra checks to avoid integer overflows
>>>> for the case of a zero size inode. This will prevent the loop doing page
>>>> copies from running and thus accessing the pages[] array beyond num_pages.
>>>>
>>>> Link: https://tracker.ceph.com/issues/67524
>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@...ux.dev>
>>>> ---
>>>> Hi!
>>>>
>>>> Please note that this patch is only lightly tested and, to be honest, I'm
>>>> not sure if this is the correct way to fix this bug. For example, if the
>>>> inode size is 0, then maybe ceph_osdc_wait_request() should have returned
>>>> 0 and the problem would be solved. However, it seems to be returning the
>>>> size of the reply message and that's not something easy to change. Or maybe
>>>> I'm just reading it wrong. Anyway, this is just an RFC to see if there's
>>>> other ideas.
>>>>
>>>> Also, the tracker contains a simple testcase for crashing the client.
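For illustration, the shape of the "extra checks" described in the patch text above, placed just before the page-copy loop in __ceph_sync_read(), might look roughly like this (a hedged sketch only, not the actual RFC patch; i_size, off, left and pages[]/num_pages are the names used in this thread, the rest of the surrounding code is assumed):

	/* Sketch: compute 'left' only when there really is data between
	 * off and i_size, so a zero (or shrunken) i_size cannot make the
	 * unsigned subtraction wrap and drive the copy loop past num_pages. */
	if (i_size == 0 || off >= i_size)
		left = 0;
	else
		left = i_size - off;

	while (left > 0) {
		/* ... copy from pages[idx] into the destination iter ... */
	}
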
>>> Just for the record, I've done a quick bisect as this bug is easily
>>> reproducible. The issue was introduced in v6.9-rc1, with commit
>>> 1065da21e5df ("ceph: stop copying to iter at EOF on sync reads").
>>> Reverting it makes the crash go away.
>> Thanks very much, Luis.
>>
>> So let's try to find the root cause of it and then improve the patch.
> What's happening is that we have an inode with size 0, but we are not
> checking its size. The bug is easy to trigger (at least in my test
> environment), and the conditions for it are:
>
> 1) the inode size has to be 0, and
> 2) the read has to return data ('ret = ceph_osdc_wait_request()').
>
> This will lead to 'left' being set to huge values due to the overflow in:
>
> left = i_size - off;
>
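To make the wrap-around concrete, here is a tiny stand-alone demonstration (plain userspace C with fixed-width types; the values are made up for the example) of what 'left' becomes when i_size is 0 and the read offset is not:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t i_size = 0;	/* inode truncated to zero by the other client */
		uint64_t off = 4096;	/* offset of the in-flight read */
		uint64_t left = i_size - off;	/* unsigned subtraction wraps */

		/* Prints 18446744073709547520, i.e. 2^64 - 4096, so a loop
		 * bounded by 'left > 0' walks far beyond num_pages. */
		printf("left = %llu\n", (unsigned long long)left);
		return 0;
	}
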
> However, sometimes (maybe most of the time) __ceph_sync_read() will not
> crash and will return -EFAULT instead. In the 'while (left > 0) { ... }'
> loop, the condition '(copied < plen)' will be true and this error is
> returned in the first iteration of the loop.
>
> So, here's a much simpler approach to fix this issue: to bailout if we
> have a 0-sized inode. What do you think?
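As a rough idea of what that bailout could look like near the top of __ceph_sync_read() (a sketch only; it assumes i_size has already been sampled under the appropriate locks, and it is not the actual v2 patch):

	/* Sketch: nothing to read from an empty inode, so return before
	 * building any OSD requests or touching pages[].  The off >= i_size
	 * case would presumably want the same treatment. */
	if (i_size == 0)
		return 0;
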
I saw your V2; let's discuss it there.
Thanks
- Xiubo
>
> Cheers,