Message-ID: <74b48c13-712b-40b6-be1c-a79803aee07f@redhat.com>
Date: Fri, 6 Sep 2024 19:08:12 +0800
From: Xiubo Li <xiubli@...hat.com>
To: Luis Henriques <luis.henriques@...ux.dev>
Cc: Ilya Dryomov <idryomov@...il.com>, ceph-devel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] ceph: fix out-of-bound array access when doing a file
read
Hi Luis,
Sorry for the late reply.
On 8/28/24 23:48, Luis Henriques wrote:
> On Wed, Aug 28 2024, Xiubo Li wrote:
>
>> On 8/27/24 21:36, Luis Henriques wrote:
>>> On Thu, Aug 22 2024, Luis Henriques (SUSE) wrote:
>>>
>>>> If, while doing a read, the inode is updated and the size is set to zero,
>>>> __ceph_sync_read() may not be able to handle it. It is thus easy to hit a
>>>> NULL pointer dereference by continuously reading a file while, on another
>>>> client, we keep truncating and writing new data into it.
>>>>
>>>> This patch fixes the issue by adding extra checks to avoid integer overflows
>>>> for the case of a zero size inode. This will prevent the loop doing page
>>>> copies from running and thus accessing the pages[] array beyond num_pages.
>>>>
>>>> Link: https://tracker.ceph.com/issues/67524
>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@...ux.dev>
>>>> ---
>>>> Hi!
>>>>
>>>> Please note that this patch is only lightly tested and, to be honest, I'm
>>>> not sure if this is the correct way to fix this bug. For example, if the
>>>> inode size is 0, then maybe ceph_osdc_wait_request() should have returned
>>>> 0 and the problem would be solved. However, it seems to be returning the
>>>> size of the reply message and that's not something easy to change. Or maybe
>>>> I'm just reading it wrong. Anyway, this is just an RFC to see if there's
>>>> other ideas.
>>>>
>>>> Also, the tracker contains a simple testcase for crashing the client.
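For illustration, the shape of the "extra checks" described in the patch text above, placed just before the page-copy loop in __ceph_sync_read(), might look roughly like this (a hedged sketch only, not the actual RFC patch; i_size, off, left and pages[]/num_pages are the names used in this thread, the rest of the surrounding code is assumed):

	/* Sketch: compute 'left' only when there really is data between
	 * off and i_size, so a zero (or shrunken) i_size cannot make the
	 * unsigned subtraction wrap and drive the copy loop past num_pages. */
	if (i_size == 0 || off >= i_size)
		left = 0;
	else
		left = i_size - off;

	while (left > 0) {
		/* ... copy from pages[idx] into the destination iter ... */
	}
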
>>> Just for the record, I've done a quick bisect as this bug is easily
>>> reproducible. The issue was introduced in v6.9-rc1, with commit
>>> 1065da21e5df ("ceph: stop copying to iter at EOF on sync reads").
>>> Reverting it makes the crash go away.
>> Thanks very much, Luis.
>>
>> So let's try to find the root cause of it and then improve the patch.
> What's happening is that we have an inode with size 0, but we are not
> checking its size. The bug is easy to trigger (at least in my test
> environment), and the conditions for it are:
>
> 1) the inode size has to be 0, and
> 2) the read has to return data ('ret = ceph_osdc_wait_request()').
>
> This will lead to 'left' being set to huge values due to the overflow in:
>
> left = i_size - off;
>
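To make the wrap-around concrete, here is a tiny stand-alone demonstration (plain userspace C with fixed-width types; the values are made up for the example) of what 'left' becomes when i_size is 0 and the read offset is not:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t i_size = 0;	/* inode truncated to zero by the other client */
		uint64_t off = 4096;	/* offset of the in-flight read */
		uint64_t left = i_size - off;	/* unsigned subtraction wraps */

		/* Prints 18446744073709547520, i.e. 2^64 - 4096, so a loop
		 * bounded by 'left > 0' walks far beyond num_pages. */
		printf("left = %llu\n", (unsigned long long)left);
		return 0;
	}
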
> However, sometimes (maybe most of the time) __ceph_sync_read() will not
> crash and will return -EFAULT instead. In the 'while (left > 0) { ... }'
> loop, the condition '(copied < plen)' will be true and this error is
> returned in the first iteration of the loop.
>
> So, here's a much simpler approach to fix this issue: to bailout if we
> have a 0-sized inode. What do you think?
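As a rough idea of what that bailout could look like near the top of __ceph_sync_read() (a sketch only; it assumes i_size has already been sampled under the appropriate locks, and it is not the actual v2 patch):

	/* Sketch: nothing to read from an empty inode, so return before
	 * building any OSD requests or touching pages[].  The off >= i_size
	 * case would presumably want the same treatment. */
	if (i_size == 0)
		return 0;
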
I saw your V2; let's discuss it there.
Thanks
- Xiubo
>
> Cheers,