Message-ID: <ea7c7077-7ad1-4ee2-a80e-13b5eb291a7e@intel.com>
Date: Tue, 14 Jan 2025 13:13:36 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, dave.hansen@...ux.intel.com,
kirill.shutemov@...ux.intel.com, Shakeel Butt <shakeel.butt@...ux.dev>,
SeongJae Park <sj@...nel.org>, David Hildenbrand <david@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>, Andrew Morton <akpm@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>, Pavel Begunkov <asml.silence@...il.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Ryan Roberts <ryan.roberts@....com>
Subject: Re: untagged_addr_remote() in do_madvise()
On 1/14/25 12:41, Lorenzo Stoakes wrote:
...
> However, MADV_HWPOISON, MADV_SOFT_OFFLINE seems fundamentally broken for tagged
> addresses:
>
> #ifdef CONFIG_MEMORY_FAILURE
> 	if (behavior == MADV_HWPOISON || behavior == MADV_SOFT_OFFLINE)
> 		return madvise_inject_error(behavior, start, start + len_in);
> #endif
>
> ^ this is invoked before untagged_addr_remote() is called (as no mmap lock is
> acquired) and so no attempt at untagging happens at all...!

Except this call path:

	madvise_inject_error() ->
	  get_user_pages_fast() ->
	    gup_fast_fallback()

does its own untagging:

	start = untagged_addr(start) & PAGE_MASK;
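
For anyone following along, here is a minimal userspace sketch of what that one line does to a tagged pointer. It is not the kernel implementation. The tag layout (six tag bits in 62:57, roughly what x86 LAM_U57 uses), the 4 KiB page size, and every sketch_* / SKETCH_* name are assumptions made up for illustration; the real untagged_addr() is per-architecture and, on x86, depends on a per-mm mask.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Assumption: six tag bits in bits 62:57 of a user pointer. */
#define SKETCH_TAG_SHIFT 57
#define SKETCH_TAG_MASK  (0x3fULL << SKETCH_TAG_SHIFT)

/* Hypothetical stand-in for untagged_addr(): clear the tag bits. */
static inline uint64_t sketch_untagged_addr(uint64_t addr)
{
	return addr & ~SKETCH_TAG_MASK;
}

/* Mirrors "start = untagged_addr(start) & PAGE_MASK;". */
static inline uint64_t sketch_gup_start(uint64_t start)
{
	return sketch_untagged_addr(start) & SKETCH_PAGE_MASK;
}

/* A tagged and an untagged pointer resolve to the same page. */
static inline void sketch_self_check(void)
{
	uint64_t plain  = 0x00007f1234567abcULL;
	uint64_t tagged = plain | (0x2aULL << SKETCH_TAG_SHIFT);

	assert(sketch_untagged_addr(tagged) == plain);
	assert(sketch_gup_start(tagged) == sketch_gup_start(plain));
}
```

The point is only that the tag is gone by the time the address is used for a page lookup, even though the caller never untagged it.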

It might also have some funky behavior if start+len_in overflows. But,
just as in the other case, it's invalid to begin with so I think
userspace kinda gets to keep the pieces.

But I do 100% agree that this is non-obvious. In a perfect world, tagged
addresses would get untagged at the user/kernel boundary in _one_ choke
point. But the world is hard and that would make things too easy and
then we wouldn't get paid the big bucks. ;)

To clarify things, I don't think it'd be the worst thing to just move
the madvise_inject_error() down and have that case acquire
mmap_read_lock(). Sure, it's not required, but it's basically debugging
code and I can't imagine it's avoiding the lock for performance reasons.
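
As a rough illustration of that ordering, here is a userspace model, not a patch. Every model_* name is hypothetical, the "lock" is just a counter, and the tag mask is an assumed value supplied by the caller. The real do_madvise() does more validation and picks its lock mode per behavior, none of which is modeled here. The assert in the untag helper reflects my assumption that the remote untag wants the mmap lock held.

```c
#include <assert.h>
#include <stdint.h>

enum model_behavior {
	MODEL_MADV_NORMAL,
	MODEL_MADV_HWPOISON,
	MODEL_MADV_SOFT_OFFLINE,
};

/* Hypothetical stand-in for an mm: lock depth, tag mask, last injected range. */
struct model_mm {
	int read_lock_depth;
	uint64_t tag_mask;
	uint64_t last_start;
	uint64_t last_end;
};

static inline void model_mmap_read_lock(struct model_mm *mm)
{
	mm->read_lock_depth++;
}

static inline void model_mmap_read_unlock(struct model_mm *mm)
{
	mm->read_lock_depth--;
}

/* Assumption: untagging against a given mm requires the lock to be held. */
static inline uint64_t model_untagged_addr_remote(const struct model_mm *mm,
						  uint64_t addr)
{
	assert(mm->read_lock_depth > 0);
	return addr & ~mm->tag_mask;
}

/* Records the range it was handed so the ordering can be checked. */
static inline int model_inject_error(struct model_mm *mm, uint64_t start,
				     uint64_t end)
{
	mm->last_start = start;
	mm->last_end = end;
	return 0;
}

/* Lock first, untag second, and only then dispatch the injection case. */
static inline int model_do_madvise(struct model_mm *mm, uint64_t start,
				   uint64_t len, enum model_behavior behavior)
{
	int ret = 0;

	model_mmap_read_lock(mm);
	start = model_untagged_addr_remote(mm, start);

	if (behavior == MODEL_MADV_HWPOISON ||
	    behavior == MODEL_MADV_SOFT_OFFLINE)
		ret = model_inject_error(mm, start, start + len);

	model_mmap_read_unlock(mm);
	return ret;
}
```

With this shape the injection path sees an already-untagged start, so nobody has to know that GUP happens to untag on its own.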