linux-kernel - Re: [PATCH 03/10] exit: Move oops specific logic from do_exit into make_task


Message-ID: <87tuefwewa.fsf@email.froward.int.ebiederm.org>
Date:   Fri, 07 Jan 2022 12:59:33 -0600
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Alexey Gladkov <legion@...nel.org>,
        Kyle Huey <me@...ehuey.com>, Oleg Nesterov <oleg@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH 03/10] exit: Move oops specific logic from do_exit into
 make_task_dead

Al Viro <viro@...iv.linux.org.uk> writes:

> On Wed, Dec 08, 2021 at 02:25:25PM -0600, Eric W. Biederman wrote:
>> -	/*
>> -	 * If do_exit is called because this processes oopsed, it's possible
>> -	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
>> -	 * continuing. Amongst other possible reasons, this is to prevent
>> -	 * mm_release()->clear_child_tid() from writing to a user-controlled
>> -	 * kernel address.
>> -	 */
>> -	force_uaccess_begin();
>
> Are you sure about that one?  It shouldn't matter, but... it's a potential
> change for do_exit() from a kernel thread.  As it is, we have that
> force_uaccess_begin() for exiting threads and for kernel ones it's not
> a no-op.  I'm not concerned about attempted userland access after that
> point for those, obviously, but I'm not sure you won't step into something
> subtle here.
>
> I would prefer to split that particular change off into a separate commit...

Thank you for catching that.  I was leaning too much on the description
in the comment of why force_uaccess_begin is there.

Catching up on the state of the set_fs/get_fs removal, it appears that
a lot of progress has been made: on many architectures set_fs/get_fs is
simply gone, and force_uaccess_begin is a no-op.

On architectures that still have set_fs/get_fs it appears all of the old
warts are present and kernel threads still run with set_fs(KERNEL_DS).
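
To make the hazard concrete, here is a rough user-space sketch of the
address-limit behaviour described above.  It is a toy model under stated
assumptions, not kernel code: every model_* name, the MODEL_* limits and
the example addresses are hypothetical and chosen only for illustration.
It models a kernel thread that starts at KERNEL_DS, and shows that a
clear_child_tid-style write to a kernel address passes the range check
until something like force_uaccess_begin drops the limit to USER_DS.

```c
/* Toy user-space model of a per-task address limit. NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address-space split, used only by this model. */
#define MODEL_USER_LIMIT   UINT64_C(0x0000800000000000)
#define MODEL_KERNEL_LIMIT UINT64_MAX

typedef struct { uint64_t seg; } model_segment_t;

static const model_segment_t MODEL_USER_DS   = { MODEL_USER_LIMIT };
static const model_segment_t MODEL_KERNEL_DS = { MODEL_KERNEL_LIMIT };

/* Stand-in for the per-task limit that set_fs()/get_fs() manipulate. */
struct model_task {
	model_segment_t addr_limit;
};

static model_segment_t model_get_fs(const struct model_task *t)
{
	return t->addr_limit;
}

static void model_set_fs(struct model_task *t, model_segment_t fs)
{
	t->addr_limit = fs;
}

/* Accept [addr, addr + size) only if it fits below the task's limit. */
static bool model_access_ok(const struct model_task *t,
			    uint64_t addr, uint64_t size)
{
	uint64_t limit = t->addr_limit.seg;

	return size <= limit && addr <= limit - size;
}

/* Models the set_fs flavour: remember the old limit, drop to USER_DS. */
static model_segment_t model_force_uaccess_begin(struct model_task *t)
{
	model_segment_t old;

	assert(t);
	old = model_get_fs(t);
	model_set_fs(t, MODEL_USER_DS);
	return old;
}

/* Would a clear_child_tid-style store of an int to tid_addr be allowed? */
static bool model_clear_child_tid_allowed(const struct model_task *t,
					  uint64_t tid_addr)
{
	return model_access_ok(t, tid_addr, sizeof(int));
}
```

In this model a task left at MODEL_KERNEL_DS accepts a tid pointer such
as 0xffff888000001000, while the same pointer is rejected once
model_force_uaccess_begin has run.  On an architecture without set_fs
there would be no limit to switch, which is the no-op case.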

Assuming it won't be too much longer before the rest of the arches have
set_fs/get_fs removed, it makes sense to leave the force_uaccess_begin
where it is, and simply let it be removed when set_fs/get_fs are
removed from the tree.

Christoph, does it look like the set_fs/get_fs removal work is going
to stall indefinitely on some architectures?  If so, I think we want to
find a way to get kernel threads to run with set_fs(USER_DS) on the
stalled architectures.  Otherwise, I think we have a real hazard of
introducing bugs that will only show up on the stalled architectures.

I finally understand why, when I updated set_child_tid in the kthread
code early in fork, x86 was fine but another architecture was not.

Eric