Message-Id: <20241125021222.356881-1-adamli@os.amperecomputing.com>
Date: Mon, 25 Nov 2024 02:12:19 +0000
From: Adam Li <adamli@...amperecomputing.com>
To: peterz@...radead.org,
	mingo@...hat.com,
	juri.lelli@...hat.com,
	vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com,
	rostedt@...dmis.org,
	bsegall@...gle.com,
	mgorman@...e.de,
	vschneid@...hat.com,
	linux-kernel@...r.kernel.org,
	patches@...erecomputing.com,
	cl@...ux.com,
	Adam Li <adamli@...amperecomputing.com>
Subject: [PATCH 0/2] sched/fair: Fix NEXT_BUDDY panic and clean up comments

Running the SPECjbb workload with NEXT_BUDDY enabled can trigger the kernel
warning and the panic shown below. The next buddy must not be set for an
entity that has sched_delayed set.
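
As a rough illustration of that rule, here is a minimal user-space sketch. It
is not the code from this series: the toy_* types and functions are
hypothetical stand-ins invented for the example, and only the field names
'next' and 'sched_delayed' come from the warning text. The assert in
toy_pick_next() plays the role of the check that fires in the first trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for a scheduling entity. */
struct toy_sched_entity {
	bool sched_delayed;	/* dequeue of this entity has been deferred */
};

/* Hypothetical, stripped-down stand-in for a CFS runqueue. */
struct toy_cfs_rq {
	struct toy_sched_entity *next;	/* the "next buddy" hint */
};

/*
 * Record the next-buddy hint only for an entity that is not delayed.
 * A delayed entity is skipped, so the hint never points at one.
 */
static inline void toy_set_next_buddy(struct toy_cfs_rq *cfs_rq,
				      struct toy_sched_entity *se)
{
	if (se->sched_delayed)
		return;

	cfs_rq->next = se;
}

/*
 * Return the next-buddy hint (possibly NULL). The assert models the
 * invariant whose violation is reported as "cfs_rq->next->sched_delayed".
 */
static inline struct toy_sched_entity *toy_pick_next(struct toy_cfs_rq *cfs_rq)
{
	assert(!cfs_rq->next || !cfs_rq->next->sched_delayed);
	return cfs_rq->next;
}
```

With this guard, calling toy_set_next_buddy() on a delayed entity leaves
cfs_rq->next unchanged, so the invariant in toy_pick_next() holds as long as
an entity is not marked delayed after it has become the next buddy. The
sketch does not model that second case.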

The 'last' and 'skip' buddies are obsolete under EEVDF, so the second patch
updates the comments in pick_next_entity() accordingly.

[  124.972623] ------------[ cut here ]------------
[  124.977300] cfs_rq->next->sched_delayed
[  124.977310] WARNING: CPU: 51 PID: 2150 at kernel/sched/fair.c:5621 pick_task_fair+0x130/0x150
[  125.049547] CPU: 51 UID: 0 PID: 2150 Comm: kworker/51:1 Tainted: G            E      6.12.0.adam+ #1
[  125.058678] Tainted: [E]=UNSIGNED_MODULE
[  125.062591] Hardware name: IEI NF5280R7/Mitchell MB, BIOS 4.4.3.1 10/16/2024
[  125.069629] Workqueue:  0x0 (mm_percpu_wq)
[  125.073721] pstate: 634000c9 (nZCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[  125.080671] pc : pick_task_fair+0x130/0x150
[  125.084841] lr : pick_task_fair+0x130/0x150
[  125.089013] sp : ffff8000ab41bc10
[  125.092315] x29: ffff8000ab41bc10 x28: 0000000000000000 x27: 0000000000000000
[  125.099440] x26: ffff000123bd8788 x25: 0000000000000402 x24: 0000000000000001
[  125.106565] x23: ffff000123bd8000 x22: ffff007dfef5cd00 x21: ffff007dfef5cd80
[  125.113689] x20: ffff007dfef5cd80 x19: ffff2001ab20a780 x18: 0000000000000006
[  125.120815] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000ab41b5e0
[  125.127938] x14: 0000000000000000 x13: 646579616c65645f x12: 64656863733e2d74
[  125.135062] x11: fffffffffc000000 x10: ffff207dfac9b420 x9 : ffff80008014ed60
[  125.142185] x8 : 00000000ffdfffff x7 : ffff207dfac80000 x6 : 000000000000122c
[  125.149309] x5 : ffff007dfef49408 x4 : 40000000ffe0122c x3 : ffff807d7d673000
[  125.156433] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000123bd8000
[  125.163561] Call trace:
[  125.165996]  pick_task_fair+0x130/0x150 (P)
[  125.170167]  pick_task_fair+0x130/0x150 (L)
[  125.174338]  pick_next_task_fair+0x48/0x3c0
[  125.178512]  __pick_next_task+0x4c/0x220
[  125.182426]  pick_next_task+0x44/0x980
[  125.186163]  __schedule+0x3d0/0x628
[  125.189645]  schedule+0x3c/0xe0
[  125.192776]  worker_thread+0x1a4/0x368
[  125.196516]  kthread+0xfc/0x110
[  125.199647]  ret_from_fork+0x10/0x20
[  125.203213] ---[ end trace 0000000000000000 ]---
[  125.207818] ------------[ cut here ]------------


[  297.371198] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000051
[  297.406112] CPU: 116 UID: 0 PID: 10328 Comm: Grizzly-worker( Tainted: G        W   E      6.12.0.adam+ #1
[  297.414884] Mem abort info:
[  297.424437] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[  297.427219]   ESR = 0x0000000096000005
[  297.431997] Hardware name: IEI NF5280R7/Mitchell MB, BIOS 4.4.3.1 10/16/2024
[  297.435734]   EC = 0x25: DABT (current EL), IL = 32 bits
[  297.442770] pstate: a34000c9 (NzCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[  297.448069]   SET = 0, FnV = 0
[  297.455018] pc : pick_task_fair+0x50/0x150
[  297.458060]   EA = 0, S1PTW = 0
[  297.462144] lr : pick_task_fair+0x50/0x150
[  297.465274]   FSC = 0x05: level 1 translation fault
[  297.469358] sp : ffff800101d93ae0
[  297.474223] Data abort info:
[  297.477526] x29: ffff800101d93ae0
[  297.480395]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[  297.480395]  x28: 0000000000000009
[  297.483703]  x27: 0000000000000000
[  297.489177]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  297.492567]
[  297.495956]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  297.500996] x26: ffff006da4381b08
[  297.502477] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000198b3b000
[  297.507777]  x25: 0000000000000080
[  297.511080] [0000000000000051] pgd=08000001c0636403
[  297.517509]  x24: 0000000000000001
[  297.520899] , p4d=08000001c0636403
[  297.525765]
[  297.529155] , pud=0000000000000000
[  297.532545] x23: ffff006da4381380
[  297.534025]
[  297.537415]  x22: ffff007dff7fed00 x21: ffff007dff7fed80
[  297.547496] x20: ffff000167f60c00 x19: 0000000000000000 x18: 0000000000000006
[  297.554621] x17: ffff8000820b3be8 x16: 0000000087c17f9e x15: ffff800083d53690
[  297.561745] x14: 0000000000000004 x13: ffff800081df4ac8 x12: 0000000000000000
[  297.568868] x11: ffff200111a3f0b0 x10: ffff200111a3efc8 x9 : ffff800080109e48
[  297.575992] x8 : 00000000000000b8 x7 : 0000000000000074 x6 : 0000000000000002
[  297.583115] x5 : 0000000000000002 x4 : 0000000000000002 x3 : 0000000000000000
[  297.590239] x2 : fffffffffffffff0 x1 : 0000000000000000 x0 : 0000000000000000
[  297.597362] Call trace:
[  297.599795]  pick_task_fair+0x50/0x150 (P)
[  297.603879]  pick_task_fair+0x50/0x150 (L)
[  297.607963]  pick_next_task_fair+0x30/0x3c0
[  297.612134]  __pick_next_task+0x4c/0x220
[  297.616045]  pick_next_task+0x44/0x980
[  297.619782]  __schedule+0x3d0/0x628
[  297.623259]  do_task_dead+0x50/0x60
[  297.626736]  do_exit+0x28c/0x410
[  297.629955]  do_group_exit+0x3c/0xa0
[  297.633518]  get_signal+0x8c4/0x8d0
[  297.636996]  do_signal+0x9c/0x270
[  297.640299]  do_notify_resume+0xe0/0x198
[  297.644212]  el0_svc+0xf4/0x170
[  297.647342]  el0t_64_sync_handler+0x10c/0x138
[  297.651687]  el0t_64_sync+0x1ac/0x1b0
[  297.655339] Code: d503201f 1400002a aa1403e0 97ffda0b (39414401)
[  297.661439] ---[ end trace 0000000000000000 ]---
[  297.726593] Kernel panic - not syncing: Oops: Fatal exception

Adam Li (2):
  sched/fair: Fix panic if NEXT_BUDDY enabled
  sched/fair: Update comments regarding last and skip buddy

 kernel/sched/fair.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

-- 
2.25.1

