Message-ID: <48bb5f01-b82b-79a7-dbc6-6ec91bcaab67@huaweicloud.com>
Date: Mon, 11 Nov 2024 08:56:50 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Chuck Lever III <chuck.lever@...cle.com>,
Yu Kuai <yukuai1@...weicloud.com>, Al Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>
Cc: Greg KH <gregkh@...uxfoundation.org>,
linux-stable <stable@...r.kernel.org>,
"harry.wentland@....com" <harry.wentland@....com>,
"sunpeng.li@....com" <sunpeng.li@....com>,
"Rodrigo.Siqueira@....com" <Rodrigo.Siqueira@....com>,
"alexander.deucher@....com" <alexander.deucher@....com>,
"christian.koenig@....com" <christian.koenig@....com>,
"Xinhui.Pan@....com" <Xinhui.Pan@....com>,
"airlied@...il.com" <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
Liam Howlett <liam.howlett@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>, Hugh Dickins <hughd@...gle.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Sasha Levin <sashal@...nel.org>,
"srinivasan.shanmugam@....com" <srinivasan.shanmugam@....com>,
"chiahsuan.chung@....com" <chiahsuan.chung@....com>,
"mingo@...nel.org" <mingo@...nel.org>,
"mgorman@...hsingularity.net" <mgorman@...hsingularity.net>,
"chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>,
"zhangpeng.00@...edance.com" <zhangpeng.00@...edance.com>,
"amd-gfx@...ts.freedesktop.org" <amd-gfx@...ts.freedesktop.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
"maple-tree@...ts.infradead.org" <maple-tree@...ts.infradead.org>,
linux-mm <linux-mm@...ck.org>, "yi.zhang@...wei.com" <yi.zhang@...wei.com>,
yangerkun <yangerkun@...wei.com>, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH 6.6 00/28] fix CVE-2024-46701
Hi,
On 2024/11/10 0:58, Chuck Lever III wrote:
>
>
>> On Nov 8, 2024, at 8:30 PM, Yu Kuai <yukuai1@...weicloud.com> wrote:
>>
>> Hi,
>>
>> On 2024/11/08 21:23, Chuck Lever III wrote:
>>>> On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@...weicloud.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 2024/11/07 22:41, Chuck Lever wrote:
>>>>> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2024/11/06 23:19, Chuck Lever III wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@...uxfoundation.org> wrote:
>>>>>>>>
>>>>>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote:
>>>>>>>>> From: Yu Kuai <yukuai3@...wei.com>
>>>>>>>>>
>>>>>>>>> Fix patch is patch 27, relied patches are from:
>>>>>>>
>>>>>>> I assume patch 27 is:
>>>>>>>
>>>>>>> libfs: fix infinite directory reads for offset dir
>>>>>>>
>>>>>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/
>>>>>>>
>>>>>>> I don't think the Maple tree patches are a hard
>>>>>>> requirement for this fix. And note that libfs did
>>>>>>> not use Maple tree originally because I was told
>>>>>>> at that time that Maple tree was not yet mature.
>>>>>>>
>>>>>>> So, a better approach might be to fit the fix
>>>>>>> onto linux-6.6.y while sticking with xarray.
>>>>>>
>>>>>> The painful part is that using xarray is not acceptable: the offset
>>>>>> is just 32 bits, and if it overflows, readdir will read nothing. That's
>>>>>> why maple_tree has to be used.
>>>>> A 32-bit range should be entirely adequate for this usage.
>>>>> - The offset allocator wraps when it reaches the maximum, it
>>>>> doesn't overflow unless there are actually billions of extant
>>>>> entries in the directory, which IMO is not likely.
>>>>
>>>> Yes, it's not likely, but it's possible, and not hard to trigger in a
>>>> test.
>>> I question whether such a test reflects any real-world
>>> workload.
>>> Besides, there are a number of other limits that will impact
>>> the ability to create that many entries in one directory.
>>> The number of inodes in one tmpfs instance is limited, for
>>> instance.
>>>> And please note that the offset increases for each new file, and
>>>> files can be removed without the offset ever going back down.
>>
>> Did you see the above explanation? Files can be removed, so you don't have
>> to store that many files to make the offset overflow.
>>>>> - The offset values are dense, so the directory can use all 2- or
>>>>> 4- billion in the 32-bit integer range before wrapping.
>>>>
>>>> Some simple math: if a user creates and removes 1 file each second, it will
>>>> take about 130 years to overflow. And if a user creates and removes 1000
>>>> files each second, it will take about 1 month to overflow.
>
>> The problem is that if the next_offset overflows to 0, then after patch
>> 27, offset_dir_open() will record the 0, and later offset_readdir will
>> return directly, while there can be many files.
>
>
> Let me revisit this for a moment. The xa_alloc_cyclic() call
> in simple_offset_add() has a range limit argument of 2 - U32_MAX.
>
> So I'm not clear how an overflow (or, more precisely, the
> reuse of an offset value) would result in a "0" offset being
> recorded. The range limit prevents the use of 0 and 1.
>
> A "0" offset value would be a bug, I agree, but I don't see
> how that can happen.
>
>
>>> The question is what happens when there are no more offset
>>> values available. xa_alloc_cyclic should fail, and file
>>> creation is supposed to fail at that point. If it doesn't,
>>> that's a bug that is outside of the use of xarray or Maple.
>>
>> Can you show me the code where xa_alloc_cyclic is supposed to fail? At least
>> according to the comments, it will return 1 if the allocation succeeded
>> after wrapping.
>>
>> * Context: Any context. Takes and releases the xa_lock. May sleep if
>> * the @gfp flags permit.
>> * Return: 0 if the allocation succeeded without wrapping. 1 if the
>> * allocation succeeded after wrapping, -ENOMEM if memory could not be
>> * allocated or -EBUSY if there are no free entries in @limit.
>> */
>> static inline int xa_alloc_cyclic(struct xarray *xa, u32 *id, void *entry,
>> struct xa_limit limit, u32 *next, gfp_t gfp)
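The return values described in the quoted comment can be illustrated with a small userspace model. This is only a toy sketch of the documented behaviour (0 without wrapping, 1 after wrapping, -EBUSY when the range is full), not the kernel implementation; `toy_alloc`, `toy_alloc_cyclic`, `toy_free`, `toy_alloc_selfcheck` and the tiny 2..5 range are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tiny range standing in for the real limit. */
#define TOY_MIN 2u
#define TOY_MAX 5u

struct toy_alloc {
	bool used[TOY_MAX + 1];	/* used[i] is true while ID i is allocated */
	uint32_t next;		/* where the next search starts */
};

/* Returns 0 without wrapping, 1 after wrapping, -EBUSY if the range is full. */
static int toy_alloc_cyclic(struct toy_alloc *a, uint32_t *id)
{
	uint32_t start = a->next < TOY_MIN ? TOY_MIN : a->next;
	uint32_t i;

	/* First pass: search from the cursor up to the top of the range. */
	for (i = start; i <= TOY_MAX; i++) {
		if (!a->used[i]) {
			a->used[i] = true;
			*id = i;
			a->next = i + 1;
			return 0;
		}
	}
	/* Second pass: wrap around and search below the cursor. */
	for (i = TOY_MIN; i < start && i <= TOY_MAX; i++) {
		if (!a->used[i]) {
			a->used[i] = true;
			*id = i;
			a->next = i + 1;
			return 1;
		}
	}
	return -EBUSY;
}

static void toy_free(struct toy_alloc *a, uint32_t id)
{
	a->used[id] = false;
}

/* Walks through the three documented outcomes in order. */
static void toy_alloc_selfcheck(void)
{
	struct toy_alloc a = { .next = TOY_MIN };
	uint32_t id = 0;

	for (uint32_t want = TOY_MIN; want <= TOY_MAX; want++)
		assert(toy_alloc_cyclic(&a, &id) == 0 && id == want);

	assert(toy_alloc_cyclic(&a, &id) == -EBUSY);	/* range is full */

	toy_free(&a, 3);
	assert(toy_alloc_cyclic(&a, &id) == 1 && id == 3);	/* reused after wrap */
}
```

In this model a caller that ignores the return value 1 simply gets a previously used ID back, which is the offset reuse being debated in this thread.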
>
> I recall (dimly) that directory entry offset value re-use
> is acceptable and preferred, so I think ignoring a "1"
> return value from xa_alloc_cyclic() is OK. If there are
> no unused offset values available, it will return -EBUSY,
> and file creation will fail.
>
> Perhaps Christian or Al can chime in here on whether
> directory entry offset value re-use is indeed expected
> to be acceptable.
This can't be acceptable in this case. The reason is straightforward:
it will mess up readdir, and that is much more serious than the CVE
itself.
Thanks,
Kuai
>
> Further, my understanding is that:
>
> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/
>
> fixes a rename issue that results in an infinite loop,
> and that's the (only) issue that underlies CVE-2024-46701.
>
> You are suggesting that there are other overflow problems
> with the xarray-based simple_offset implementation. If I
> can confirm them, then I can get these fixed in v6.6. But
> so far, I'm not sure I completely understand these other
> failure modes.
>
> Are you suggesting that the above fix /introduces/ the
> 0 offset problem?
>
> --
> Chuck Lever
>
>