linux-kernel - Re: untagged_addr_remote() in do

Message-ID: <a351166f-518c-4322-b26f-d0646f14ab8b@lucifer.local>
Date: Tue, 14 Jan 2025 20:41:41 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>, dave.hansen@...ux.intel.com,
        kirill.shutemov@...ux.intel.com, Shakeel Butt <shakeel.butt@...ux.dev>,
        SeongJae Park <sj@...nel.org>, David Hildenbrand <david@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Jens Axboe <axboe@...nel.dk>, Pavel Begunkov <asml.silence@...il.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Ryan Roberts <ryan.roberts@....com>
Subject: Re: untagged_addr_remote() in do_madvise()

+cc Kirill for commit

On Tue, Jan 14, 2025 at 02:43:17PM -0500, Liam R. Howlett wrote:
> Hello,
>
> I noticed that mm/madvise.c:do_madvise() calls untagged_addr_remote()
> after validating start.
>
> Looking through git blame shows that this line was moved in
> 428e106ae1ad4 ("mm: Introduce untagged_addr_remote()") [1], with the
> reason being:
>
>     The new helper untagged_addr_remote() has to be used when the address
>     targets remote process. It requires the mmap lock for target mm to be
>     taken.
>
> Although this may be needed, we cannot move the untagging below
> validating the start/end because we have not validated the start/end
> that will be used for the operation, or at least, it isn't clear why it's
> okay?
>
> Can anyone tell me why the code today is correct?  That is, how can we
> trust the validation of start/end is still okay after we change the
> start/end by untagging the start?
>
> I think we have to move the locking and the untagging above the
> validation for this to work as expected?
>
> [1] https://lore.kernel.org/all/20230312112612.31869-6-kirill.shutemov@linux.intel.com/
>
> Thanks,
> Liam

To avoid losing context from the IRC discussion: it seems to me the only check
that potentially needs to be moved is:

	end = start + len;
	if (end < start)
		return -EINVAL;

However, MADV_HWPOISON and MADV_SOFT_OFFLINE seem fundamentally broken for
tagged addresses:

#ifdef CONFIG_MEMORY_FAILURE
	if (behavior == MADV_HWPOISON || behavior == MADV_SOFT_OFFLINE)
		return madvise_inject_error(behavior, start, start + len_in);
#endif

^ this is invoked before untagged_addr_remote() is called (as no mmap lock is
acquired) and so no attempt at untagging happens at all...!

We do need to fix this... unless CONFIG_MEMORY_FAILURE somehow automagically
disallows address tagging...

Perhaps in that case we need to detect whether the address is tagged and do some
horror-show hack, maybe acquire the lock, untag and drop the lock... Or maybe
make it arch-dependent, since it seems only x86 actually needs to hold the lock
for untagging?
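
To make the early-return problem concrete, here is a rough userspace model (not
kernel code). It ASSUMES the tag lives in bits 62:57 of the address, roughly as
with x86 LAM_U57; MODEL_TAG_MASK and every model_*() helper are made up for
illustration and are not kernel APIs. It only shows that an address handed on
before untagging still carries its tag bits, and what a "detect the tag" check
could look like:

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: tag occupies bits 62:57; chosen for illustration only. */
#define MODEL_TAG_MASK (UINT64_C(0x3f) << 57)

/* Hypothetical check: does the address carry any tag bits? */
static bool model_addr_is_tagged(uint64_t addr)
{
	return (addr & MODEL_TAG_MASK) != 0;
}

/* Hypothetical stand-in for untagging: clear the assumed tag bits. */
static uint64_t model_untag(uint64_t addr)
{
	return addr & ~MODEL_TAG_MASK;
}

/*
 * Model of the early MADV_HWPOISON/MADV_SOFT_OFFLINE return: the address
 * that would reach the error-injection path, with or without an untag
 * step in front of it.
 */
static uint64_t model_inject_target(uint64_t start, bool untag_first)
{
	return untag_first ? model_untag(start) : start;
}
```

With untag_first == false the tagged value is passed through unchanged, which
mirrors the concern above; whether the real fix should untag under the lock or
per-arch is left open here.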

Other than this case I think we are good to just put:

	end = start + len;
	if (end < start)
		return -EINVAL;

below the untagged_addr_remote() invocation?
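
As a rough userspace sketch of why the ordering matters (again, not kernel code):
the model below ASSUMES a tag in bits 62:57 and a 4096-byte page, and the
model_*() names are hypothetical. It runs the wrap-around check either on the
tagged start (check first, untag afterwards) or on the untagged start (untag
first), so the two orders can be compared for the same input:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE UINT64_C(4096)
/* ASSUMPTION: tag occupies bits 62:57; chosen for illustration only. */
#define MODEL_TAG_MASK (UINT64_C(0x3f) << 57)

/* Hypothetical stand-in for untagging: clear the assumed tag bits. */
static uint64_t model_untag(uint64_t addr)
{
	return addr & ~MODEL_TAG_MASK;
}

/*
 * Model of the end = start + len overflow check.
 * untag_first == false: check the tagged start, then untag and recompute end.
 * untag_first == true:  untag, then check the range that is actually used.
 * Returns 0 and stores the final end, or -EINVAL if the checked sum wraps.
 */
static int model_range_check(uint64_t start, uint64_t len, bool untag_first,
			     uint64_t *end_out)
{
	uint64_t end;

	assert(len % MODEL_PAGE_SIZE == 0); /* caller passes an aligned length */

	if (untag_first)
		start = model_untag(start);

	end = start + len; /* unsigned arithmetic wraps on overflow */
	if (end < start)
		return -EINVAL;

	if (!untag_first) {
		start = model_untag(start);
		end = start + len; /* recomputed, but never re-validated */
	}

	*end_out = end;
	return 0;
}
```

For ordinary lengths both orders agree, but with a large tag and a large len the
tagged sum can wrap where the untagged one does not, so the verdict depends on
which address was checked. The real do_madvise() has further checks that this
model leaves out.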