Message-Id: <20230914151839.3635-1-wang.yong12@zte.com.cn>
Date: Thu, 14 Sep 2023 23:18:39 +0800
From: Yong Wang <yongw.pur@...il.com>
To: chrubis@...e.cz, naresh.kamboju@...aro.org
Cc: alex.bennee@...aro.org, anders.roxell@...aro.org, arnd@...db.de,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
ltp@...ts.linux.it, mdoucha@...e.cz, peterz@...radead.org,
vincent.guittot@...aro.org, wegao@...e.com, wang.yong12@....com.cn,
yang.yang29@....com.cn, ran.xiaokai@....com.cn
Subject: LTP: cfs_bandwidth01: Unable to handle kernel NULL pointer dereference
Hello!
>Following kernel crash noticed on Linux stable-rc 6.5.3-rc1 on qemu-arm64 while
>running LTP sched tests cases.
>
>This is not always reproducible.
I also hit this problem on Linux 5.10 (5.10.59-rt52) in an arm64 environment.
The kernel log is as follows:
[ 2893.003795] ==================================================================
[ 2893.003822] BUG: KASAN: null-ptr-deref in pick_next_task_fair+0x130/0x4e0
[ 2893.003880] Read of size 8 at addr 0000000000000080 by task ksoftirqd/0/12
[ 2893.003901]
[ 2893.003914] CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: P O 5.10.59-rt52#1
[ 2893.003959] Call trace:
[ 2893.003968] dump_backtrace+0x0/0x2e8
[ 2893.004009] show_stack+0x18/0x28
[ 2893.004032] dump_stack+0x104/0x174
[ 2893.004067] kasan_report+0x1d0/0x258
[ 2893.004098] __asan_load8+0x94/0xd0
[ 2893.004126] pick_next_task_fair+0x130/0x4e0
[ 2893.004164] __schedule+0x220/0xbd0
[ 2893.004192] schedule+0xec/0x1a0
[ 2893.004216] smpboot_thread_fn+0x124/0x548
[ 2893.004246] kthread+0x24c/0x278
[ 2893.004277] ret_from_fork+0x10/0x34
[ 2893.004306] ==================================================================
[ 2893.004325] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
[ 2893.152267] Mem abort info:
[ 2893.152639] ESR = 0x96000004
[ 2893.153045] EC = 0x25: DABT (current EL), IL = 32 bits
[ 2893.153739] SET = 0, FnV = 0
[ 2893.154143] EA = 0, S1PTW = 0
[ 2893.154560] Data abort info:
[ 2893.154940] ISV = 0, ISS = 0x00000004
[ 2893.155443] CM = 0, WnR = 0
[ 2893.155838] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000188edb000
The source code where the problem occurs corresponds to:
se = pick_next_entity(cfs_rq, curr);
cfs_rq = group_cfs_rq(se); //se is NULL!
pick_next_entity() returns NULL here, so the NULL pointer dereference happens as soon as group_cfs_rq() reads a member of se.
However, it is not clear under what circumstances pick_next_entity() returns NULL.
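To make the failure point concrete, here is a small user-space sketch of the descent loop. It is only a model under my own assumptions, not kernel code and not a proposed fix: the struct layouts, the next_se field and all model_* names are hypothetical. It only shows that the loop dereferences se immediately after the pick, so a NULL return faults on the very next line.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; layout is hypothetical. */
struct cfs_rq;

struct sched_entity {
	struct cfs_rq *my_q;		/* non-NULL only for group entities */
};

struct cfs_rq {
	struct sched_entity *next_se;	/* hypothetical: what the picker returns */
};

/* Model of pick_next_entity(): may return NULL, as seen in the report. */
static struct sched_entity *model_pick_next_entity(struct cfs_rq *cfs_rq)
{
	return cfs_rq->next_se;
}

/* Model of group_cfs_rq(): reads se->my_q, so se must not be NULL. */
static struct cfs_rq *model_group_cfs_rq(struct sched_entity *se)
{
	assert(se != NULL);
	return se->my_q;
}

/* Walk down the hierarchy; stop instead of dereferencing a NULL se. */
static struct sched_entity *model_pick_task(struct cfs_rq *cfs_rq)
{
	struct sched_entity *se;

	do {
		se = model_pick_next_entity(cfs_rq);
		if (!se)	/* without this check the next line faults */
			return NULL;
		cfs_rq = model_group_cfs_rq(se);
	} while (cfs_rq);

	return se;
}
```

In this model, a hierarchy whose child runqueue has no entity to pick makes model_pick_task() return NULL, where the unguarded loop would crash. Such a check would only hide the symptom; the open question is still why the runqueue has nothing to pick.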
In addition, in my environment the following steps often reproduce the problem:
stress-ng -c 8 --cpu-load 100 --sched fifo --sched-prio 1 --cpu-method pi -t 900 &
runltp -s cfs_bandwidth01
I hope this helps to solve the problem.
Thanks.