[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87zgoswkym.fsf@email.froward.int.ebiederm.org>
Date: Wed, 22 Dec 2021 12:30:57 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Nathan Chancellor <nathan@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Alexey Gladkov <legion@...nel.org>,
Kyle Huey <me@...ehuey.com>, Oleg Nesterov <oleg@...hat.com>,
Kees Cook <keescook@...omium.org>,
Al Viro <viro@...iv.linux.org.uk>,
Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: [PATCH 09/10] kthread: Ensure struct kthread is present for all kthreads
Nathan Chancellor <nathan@...nel.org> writes:
> Hi Eric,
>
> On Wed, Dec 08, 2021 at 02:25:31PM -0600, Eric W. Biederman wrote:
>> Today the rules are a bit iffy and arbitrary about which kernel
>> threads have struct kthread present. Both idle threads and thread
>> started with create_kthread want struct kthread present so that is
>> effectively all kernel threads. Make the rule that if PF_KTHREAD
>> and the task is running then struct kthread is present.
>>
>> This will allow the kernel thread code to using tsk->exit_code
>> with different semantics from ordinary processes.
>>
>> To make ensure that struct kthread is present for all
>> kernel threads move it's allocation into copy_process.
>>
>> Add a deallocation of struct kthread in exec for processes
>> that were kernel threads.
>>
>> Move the allocation of struct kthread for the initial thread
>> earlier so that it is not repeated for each additional idle
>> thread.
>>
>> Move the initialization of struct kthread into set_kthread_struct
>> so that the structure is always and reliably initailized.
>>
>> Clear set_child_tid in free_kthread_struct to ensure the kthread
>> struct is reliably freed during exec. The function
>> free_kthread_struct does not need to clear vfork_done during exec as
>> exec_mm_release called from exec_mmap has already cleared vfork_done.
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
>
> This patch as commit 40966e316f86 ("kthread: Ensure struct kthread is
> present for all kthreads") in -next causes an ARCH=arm
> multi_v5_defconfig kernel to fail to boot in QEMU. I had to apply commit
> 6692c98c7df5 ("fork: Stop protecting back_fork_cleanup_cgroup_lock with
> CONFIG_NUMA") to get it to build and I applied commit dd621ee0cf8e
> ("kthread: Warn about failed allocations for the init kthread") to avoid
> the known runtime warning.
>
> $ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- distclean multi_v5_defconfig all
>
> $ qemu-system-arm \
> -initrd rootfs.cpio \
> -append earlycon \
> -machine palmetto-bmc \
> -no-reboot \
> -dtb arch/arm/boot/dts/aspeed-bmc-opp-palmetto.dtb \
> -display none \
> -kernel arch/arm/boot/zImage \
> -m 512m \
> -nodefaults \
> -serial mon:stdio
> qemu-system-arm: warning: nic ftgmac100.0 has no peer
> qemu-system-arm: warning: nic ftgmac100.1 has no peer
> Booting Linux on physical CPU 0x0
> Linux version 5.16.0-rc1-00016-g40966e316f86-dirty (nathan@...hlinux-ax161) (arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 PREEMPT Wed Dec 22 18:08:53 UTC 2021
> CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00093177
> CPU: VIVT data cache, VIVT instruction cache
> OF: fdt: Machine model: Palmetto BMC
> earlycon: ns16550a0 at MMIO 0x1e784000 (options '')
> printk: bootconsole [ns16550a0] enabled
> Memory policy: Data cache writethrough
> cma: Reserved 16 MiB at 0x5b000000
> Zone ranges:
> DMA [mem 0x0000000040000000-0x000000005edfffff]
> Normal empty
> HighMem [mem 0x000000005ee00000-0x000000005fffffff]
> Movable zone start for each node
> Early memory node ranges
> node 0: [mem 0x0000000040000000-0x000000005bffffff]
> node 0: [mem 0x000000005c000000-0x000000005dffffff]
> node 0: [mem 0x000000005e000000-0x000000005edfffff]
> node 0: [mem 0x000000005ee00000-0x000000005fffffff]
> Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
> Built 1 zonelists, mobility grouping on. Total pages: 130084
> Kernel command line: earlycon
> Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
> Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
> mem auto-init: stack:off, heap alloc:off, heap free:off
> Memory: 433140K/524288K available (9628K kernel code, 2019K rwdata, 2368K rodata, 340K init, 661K bss, 74764K reserved, 16384K cma-reserved, 0K highmem)
> SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> rcu: Preemptible hierarchical RCU implementation.
> rcu: RCU event tracing is enabled.
> Trampoline variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> i2c controller registered, irq 16
> random: get_random_bytes called from start_kernel+0x408/0x624 with crng_init=0
> clocksource: FTTMR010-TIMER2: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
> sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
> Switching to timer-based delay loop, resolution 41ns
> Console: colour dummy device 80x30
> printk: console [tty0] enabled
> printk: bootconsole [ns16550a0] disabled
>
> After that, it just hangs.
>
> The rootfs is available at https://github.com/ClangBuiltLinux/boot-utils
> in the images/arm folder.
>
> If there is any more information that I can provide or changes to test,
> please let me know.
Well crap. I hate to hear my code is causing problems like this.
This is however a very good bug report, which I very much appreciate.
I think I have enough information. I will see if I can reproduce this
and track down what is happening.
Have you by any chance tried linux-next with just these changes backed
out?
Eric
Powered by blists - more mailing lists