Message-ID: <ZyKeAGgnuZiz3a4A@slm.duckdns.org>
Date: Wed, 30 Oct 2024 10:58:40 -1000
From: Tejun Heo <tj@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...nel.org, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, void@...ifault.com,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH 2/6] sched: Employ sched_change guards

On Wed, Oct 30, 2024 at 04:12:57PM +0100, Peter Zijlstra wrote:
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
...
> @@ -5206,18 +5202,15 @@ static int scx_ops_enable(struct sched_e
>  		const struct sched_class *old_class = p->sched_class;
>  		const struct sched_class *new_class =
>  			__setscheduler_class(p->policy, p->prio);
> -		struct sched_enq_and_set_ctx ctx;
>  
>  		if (old_class != new_class && p->se.sched_delayed)
> -			dequeue_task(task_rq(p), p, DEQUEUE_SLEEP | DEQUEE_DELAYED);
> -
> -		sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx);
> -
> -		p->scx.slice = SCX_SLICE_DFL;
> -		p->sched_class = new_class;
> -		check_class_changing(task_rq(p), p, old_class);
> +			dequeue_task(task_rq(p), p, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
>  
> -		sched_enq_and_set_task(&ctx);
> +		scoped_guard (sched_change, p, DEQUEUE_SAVE | DEQUEUE_MOVE) {
> +			p->scx.slice = SCX_SLICE_DFL;
> +			p->sched_class = new_class;
> +			check_class_changing(task_rq(p), p, old_class);
> +		}
>  
>  		check_class_changed(task_rq(p), p, old_class, p->prio);
>  	}

I get the following warning because of a missing update_rq_clock():

  rq->clock_update_flags < RQCF_ACT_SKIP
  WARNING: CPU: 2 PID: 1692 at kernel/sched/sched.h:1647 update_load_avg+0x7c3/0x8c0
  Modules linked in:
  CPU: 2 UID: 0 PID: 1692 Comm: runner Not tainted 6.12.0-rc5-work-00336-g9bfae8f5ca65-dirty #515
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
  Sched_ext: maximal (enabling+all)
  RIP: 0010:update_load_avg+0x7c3/0x8c0
  Code: 00 4c 2b bb c8 01 00 00 40 f6 c5 02 0f 84 e7 f8 ff ff e9 fa f8 ff ff c6 05 28 1f 81 02 01 48 c7 c7 f9 c5 dd 82 e8 1d 04 fb ff <0f> 0b e9 aa f8 ff ff 0f 0b 41 83 be f0 0c 00 00 01 0f 86 8d f8 ff
  RSP: 0018:ffffc900003c7c60 EFLAGS: 00010086
  RAX: 0000000000000026 RBX: ffff88810163d400 RCX: 0000000000000027
  RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff888237c9b448
  RBP: 0000000000000000 R08: 0000000000001fff R09: ffffffff8368dff0
  R10: 0000000000005ffd R11: 0000000000000004 R12: ffffffff82edb890
  R13: ffff888100398080 R14: ffff888237c30180 R15: ffff888100398000
  FS:  00007f850b4006c0(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f84fc000020 CR3: 0000000103bfa000 CR4: 0000000000750eb0
  PKRU: 55555554
  Call Trace:
   <TASK>
   detach_task_cfs_rq+0x31/0xf0
   check_class_changed+0x29/0x70
   bpf_scx_reg+0xa72/0xc30
   bpf_struct_ops_link_create+0xf8/0x140
   __sys_bpf+0x348/0x510
   __x64_sys_bpf+0x18/0x20
   do_syscall_64+0x7b/0x140
   ? exc_page_fault+0x6b/0xb0
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f850c0551fd
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 fa 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007f850b3ffba8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f850c0551fd
  RDX: 0000000000000040 RSI: 00007f850b3ffdc0 RDI: 000000000000001c
  RBP: 00007f850b3ffbd0 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000001 R11: 0000000000000202 R12: 00007f850b4006c0
  R13: ffffffffffffff80 R14: 000000000000005f R15: 00007ffdf6c8de30
   </TASK>

The following patch fixes it. Thanks.

---
 kernel/sched/ext.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4496,6 +4496,8 @@ static void scx_ops_disable_workfn(struc
 		const struct sched_class *new_class =
 			__setscheduler_class(p->policy, p->prio);
 
+		update_rq_clock(task_rq(p));
+
 		if (old_class != new_class && p->se.sched_delayed)
 			dequeue_task(task_rq(p), p, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
 
@@ -5208,6 +5210,8 @@ static int scx_ops_enable(struct sched_e
 		const struct sched_class *new_class =
 			__setscheduler_class(p->policy, p->prio);
 
+		update_rq_clock(task_rq(p));
+
 		if (old_class != new_class && p->se.sched_delayed)
 			dequeue_task(task_rq(p), p, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
