[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1883cf6b-7756-405b-a843-d8ae31e01d61@ghiti.fr>
Date: Mon, 25 Mar 2024 08:25:28 +0100
From: Alexandre Ghiti <alex@...ti.fr>
To: Arnd Bergmann <arnd@...db.de>, Samuel Holland
<samuel.holland@...ive.com>, Alexandre Ghiti <alexghiti@...osinc.com>
Cc: Palmer Dabbelt <palmer@...belt.com>, linux-riscv@...ts.infradead.org,
Albert Ou <aou@...s.berkeley.edu>, Andrew Morton
<akpm@...ux-foundation.org>, Charlie Jenkins <charlie@...osinc.com>,
guoren <guoren@...nel.org>, Jisheng Zhang <jszhang@...nel.org>,
Kemeng Shi <shikemeng@...weicloud.com>, Matthew Wilcox
<willy@...radead.org>, Mike Rapoport <rppt@...nel.org>,
Paul Walmsley <paul.walmsley@...ive.com>, Xiao W Wang
<xiao.w.wang@...el.com>, Yangyu Chen <cyy@...self.name>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] riscv: Define TASK_SIZE_MAX for __access_ok()
Hi Arnd,
On 24/03/2024 23:05, Arnd Bergmann wrote:
> On Tue, Mar 19, 2024, at 17:51, Alexandre Ghiti wrote:
>> On 18/03/2024 22:29, Samuel Holland wrote:
>>> On 2024-03-18 3:50 PM, Alexandre Ghiti wrote:
>>>> On Wed, Mar 13, 2024 at 7:00 PM Samuel Holland
>>> It looks like the call to fixup_exception() [added
>>> in 416721ff05fd ("riscv, mm: Perform BPF exhandler fixup on page fault")] is
>>> only intended to catch null pointer dereferences. So making the change wouldn't
>>> have any functional impact, but it would still be a valid optimization.
>>>
>>>> Or I was wondering if it would not be better to do like x86 and use an
>>>> alternative, it would be more correct (even though I believe your
>>>> solution works)
>>>> https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/page_64.h#L82.
>>> What would be the benefit of using an alternative? Any access to an address
>>> between TASK_SIZE and TASK_SIZE_MAX is guaranteed to generate a page fault, so
>>> the only benefit I see is returning -EFAULT slightly faster at the cost of
>>> applying a few hundred alternatives at boot. But it's possible I'm missing
>>> something.
>>
>> The use of alternatives allows to return right away if the buffer is
>> beyond the usable user address space, and it's not just "slightly
>> faster" for some cases (a very large buffer with only a few bytes being
>> beyond the limit or someone could fault-in all the user pages and fail
>> very late...etc). access_ok() is here to guarantee that such situations
>> don't happen, so actually it makes more sense to use an alternative to
>> avoid that.
> The access_ok() function really wants a compile-time constant
> value for TASK_SIZE_MAX so it can do constant folding for
> repeated calls inside of one function, so for configurations
> with a boot-time selected TASK_SIZE_64 it's already not ideal,
> with or without alternatives.
>
> If I read the current code correctly, riscv doesn't even
> have a way to build with a compile-time selected
> VA_BITS/PGDIR_SIZE, which is probably a better place to
> start optimizing, since this rarely needs to be selected
> dynamically.
Indeed, we do not support compile-time fixed VA_BITS! We could, but that
would only be used for custom kernels. I don't think distro kernels will
ever (?) propose 3 different kernels for sv39, sv48 and sv57 because the
cost of dynamically choosing the address space width is not big enough
to me (and the burden of maintaining 3 different kernels is).
Let me know if I'm wrong, I'd be happy to work on that.
Thanks,
Alex
>
> Arnd
Powered by blists - more mailing lists