Message-ID: <87v8udwhc6.fsf@email.froward.int.ebiederm.org>
Date:   Tue, 10 May 2022 10:14:01 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     linux-arch@...r.kernel.org, Tejun Heo <tj@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Al Viro <viro@...IV.linux.org.uk>,
        Jens Axboe <axboe@...nel.dk>,
        Linus Torvalds <torvalds@...uxfoundation.org>,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Максим Кутявин 
        <maximkabox13@...il.com>
Subject: Re: [PATCH 1/7] kthread: Don't allocate kthread_struct for init and
 umh

Thomas Gleixner <tglx@...utronix.de> writes:

> On Fri, May 06 2022 at 09:15, Eric W. Biederman wrote:
>>  	 * the init task will end up wanting to create kthreads, which, if
>>  	 * we schedule it before we create kthreadd, will OOPS.
>>  	 */
>> -	pid = kernel_thread(kernel_init, NULL, CLONE_FS);
>> +	pid = user_mode_thread(kernel_init, NULL, CLONE_FS);
>
> So init does not have PF_KTHREAD set anymore, which causes this to go
> sideways with a NULL pointer dereference in get_mm_counter() on next:

Well, not after the change above, but in a later patch, yes.

Patch 1/7 really gets us back to the previous status quo, where
I introduced the breakage.

>  get_mm_counter include/linux/mm.h:1996 [inline]
>  get_mm_rss include/linux/mm.h:2049 [inline]
>  task_nr_scan_windows.isra.0+0x23/0x120 kernel/sched/fair.c:1123
>  task_scan_min kernel/sched/fair.c:1144 [inline]
>  task_scan_start+0x6c/0x400 kernel/sched/fair.c:1150
>  task_tick_numa kernel/sched/fair.c:2944 [inline]
>  task_tick_fair+0xaeb/0xef0 kernel/sched/fair.c:11186
>  scheduler_tick+0x20a/0x5e0 kernel/sched/core.c:5380
>
>  https://lore.kernel.org/lkml/0000000000008a9fbb05dea76400@google.com
>
> because the fence in task_tick_numa():
>
>  	if ((curr->flags & (PF_EXITING | PF_KTHREAD)) || work->next != work)
> 		return;
>
> is not longer sufficient. It needs also to bail if !curr->mm.

Agreed.  I proposed a patch to do just that a little while ago.

> I'm worried that there are more of these issues lurking. Haven't looked
> yet.

I looked earlier and I missed this one.  I am going to look again today,
along with applying the obvious fix to task_tick_numa.
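
For illustration only, here is a minimal userspace sketch of the strengthened
check discussed above: bail out when the task has no mm, in addition to the
existing PF_EXITING | PF_KTHREAD and work->next tests. The struct types and
the helper name numa_tick_should_bail() are hypothetical stand-ins, not the
kernel's definitions, and the flag values are only meant to be distinct bits.
This is not the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; assumed values, used here only as distinct bits. */
#define PF_EXITING 0x00000004u
#define PF_KTHREAD 0x00200000u

/* Hypothetical stand-ins for mm_struct, callback_head and task_struct. */
struct mm_stub { long rss; };
struct work_stub { struct work_stub *next; };
struct task_stub {
	unsigned int flags;
	struct mm_stub *mm;          /* NULL for tasks without a user mm */
	struct work_stub numa_work;  /* next == &numa_work means "not queued" */
};

/*
 * Returns true when the NUMA tick should do nothing for this task.
 * The !curr->mm test comes first, so nothing later can dereference
 * a NULL mm even if PF_KTHREAD is not set on the task.
 */
static bool numa_tick_should_bail(const struct task_stub *curr)
{
	const struct work_stub *work = &curr->numa_work;

	if (!curr->mm)
		return true;
	if ((curr->flags & (PF_EXITING | PF_KTHREAD)) || work->next != work)
		return true;
	return false;
}
```

With this shape, a task that lacks PF_KTHREAD but has no mm (the init case
described above) is skipped before anything like get_mm_rss() would run.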

I don't think there are many, but when code has evolved into a shape
that is not easy to understand, things occasionally slip through while
the abstractions are being made clear.  The reason to rework the code
and make it clear is that once it has accumulated many subtle issues,
any change becomes brittle.

Eric
