linux-kernel - Re: [RFC PATCH v8] sched: Fix performance regression introduced by mm

Message-ID: <20230420060004.GA52173@ziqianlu-desk2>
Date:   Thu, 20 Apr 2023 14:00:04 +0800
From:   Aaron Lu <aaron.lu@...el.com>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        <linux-kernel@...r.kernel.org>, Olivier Dion <odion@...icios.com>,
        <michael.christie@...cle.com>
Subject: Re: [RFC PATCH v8] sched: Fix performance regression introduced by
 mm_cid

On Mon, Apr 17, 2023 at 11:08:31AM -0400, Mathieu Desnoyers wrote:

> +/*
> + * Save a snapshot of the current runqueue time of this cpu
> + * with the per-cpu cid value, allowing to estimate how recently it was used.
> + */
> +static inline void mm_cid_snapshot_time(struct mm_struct *mm)
>  {
> -	lockdep_assert_irqs_disabled();
> -	if (cid < 0)
> -		return;
> -	raw_spin_lock(&mm->cid_lock);
> -	__cpumask_clear_cpu(cid, mm_cidmask(mm));
> -	raw_spin_unlock(&mm->cid_lock);
> +	struct rq *rq = this_rq();
> +	struct mm_cid *pcpu_cid;
> +
> +	lockdep_assert_rq_held(rq);

On the wakeup path, when src_cid is migrated to dst_cid, this_rq() is the
waker's rq, which is not locked; only the wakee's dst_rq is locked.

I got the warning below when booting a VM with v8:

[    2.496964] ------------[ cut here ]------------
[    2.497499] WARNING: CPU: 13 PID: 99 at kernel/sched/sched.h:1357 sched_mm_cid_migrate_to+0x2ce/0x330
[    2.498478] Modules linked in:
[    2.498481] CPU: 13 PID: 99 Comm: kworker/u32:5 Tainted: G        W 6.3.0-rc7-00002-gb8012ce004f4 #32
[    2.498484] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc36 04/01/2014
[    2.498485] Workqueue: events_unbound flush_to_ldisc
[    2.501094] RIP: 0010:sched_mm_cid_migrate_to+0x2ce/0x330
[    2.501099] Code: 45 89 74 24 08 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8d 7b 18 be ff ff ff ff7
[    2.503101] RSP: 0018:ffffc900003d7ac0 EFLAGS: 00010046
[    2.503608] RAX: 0000000000000000 RBX: ffff88842f3fe700 RCX: 0000000000000001
[    2.504313] RDX: 0000000000000000 RSI: ffffffff823ccffd RDI: ffffffff8244e8fe
[    2.505000] RBP: ffffe8ffff20c268 R08: 00000000954b8e6a R09: 00000000950aa3ff
[    2.505680] R10: 00000000f950aa3f R11: ffff88810005e900 R12: ffffe8fffe60c268
[    2.506406] R13: ffff88810005e900 R14: 0000000000000000 R15: 00000000ffffffff
[    2.506935] FS:  0000000000000000(0000) GS:ffff88842f200000(0000) knlGS:0000000000000000
[    2.507375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.507678] CR2: 00007f0aadff6db8 CR3: 0000000106ba2002 CR4: 0000000000770ee0
[    2.508050] PKRU: 55555554
[    2.508209] Call Trace:
[    2.508342]  <TASK>
[    2.508492]  ttwu_do_activate+0x129/0x300
[    2.508727]  try_to_wake_up+0x2b7/0x8a0
[    2.508963]  ep_autoremove_wake_function+0x11/0x50
[    2.509259]  __wake_up_common+0x83/0x1a0
[    2.509481]  __wake_up_common_lock+0x81/0xd0
[    2.509738]  ep_poll_callback+0x147/0x310
[    2.509965]  __wake_up_common+0x83/0x1a0
[    2.510185]  __wake_up_common_lock+0x81/0xd0
[    2.510463]  n_tty_receive_buf_common+0x235/0x6a0
[    2.510728]  tty_port_default_receive_buf+0x3d/0x70
[    2.510987]  flush_to_ldisc+0x9b/0x1a0
[    2.511191]  process_one_work+0x27a/0x560
[    2.511420]  worker_thread+0x4f/0x3b0
[    2.511657]  ? __pfx_worker_thread+0x10/0x10
[    2.511930]  kthread+0xf2/0x120
[    2.512108]  ? __pfx_kthread+0x10/0x10
[    2.512340]  ret_from_fork+0x29/0x50
[    2.512552]  </TASK>
[    2.512679] ---[ end trace 0000000000000000 ]---

$ ./scripts/faddr2line ../guest_debug/vmlinux sched_mm_cid_migrate_to+0x2ce
sched_mm_cid_migrate_to+0x2ce/0x330:
lockdep_assert_rq_held at kernel/sched/sched.h:1357
(inlined by) mm_cid_snapshot_time at kernel/sched/sched.h:3355
(inlined by) sched_mm_cid_migrate_to at kernel/sched/core.c:11666

> +	pcpu_cid = this_cpu_ptr(mm->pcpu_cid);
> +	WRITE_ONCE(pcpu_cid->time, rq->clock);
> +}
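
To illustrate the locking mismatch outside the kernel, here is a minimal
userspace sketch. Everything in it is a toy stand-in and an assumption made
for illustration: the "locked" flag models what lockdep_assert_rq_held()
checks, "current_cpu" models the CPU the waker runs on, and
snapshot_time_rq() is a hypothetical variant that takes the rq from the
caller. It only shows why looking up this_rq() trips the assertion on a
remote wakeup; it is not a proposed patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 2

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct rq {
	int cpu;
	bool locked;		/* models "this rq's lock is held" */
	uint64_t clock;
};

struct mm_cid {
	uint64_t time;
};

struct mm_struct {
	struct mm_cid pcpu_cid[NR_CPUS];
};

static struct rq runqueues[NR_CPUS];
static int current_cpu;		/* CPU the calling (waker) code runs on */

static struct rq *this_rq(void)
{
	return &runqueues[current_cpu];
}

/* Models the v8 helper: returns false where lockdep would warn. */
static bool snapshot_time_this_rq(struct mm_struct *mm)
{
	struct rq *rq = this_rq();

	if (!rq->locked)
		return false;
	mm->pcpu_cid[rq->cpu].time = rq->clock;
	return true;
}

/* Hypothetical variant: the caller passes the rq it actually holds. */
static bool snapshot_time_rq(struct rq *rq, struct mm_struct *mm)
{
	if (!rq->locked)
		return false;
	mm->pcpu_cid[rq->cpu].time = rq->clock;
	return true;
}

struct wakeup_result {
	bool this_rq_ok;	/* did the this_rq() based helper pass? */
	bool dst_rq_ok;		/* did the explicit-rq helper pass? */
	uint64_t dst_time;	/* snapshot stored for the destination CPU */
};

/* Waker runs on CPU 0 (unlocked); the wakee's dst_rq is CPU 1 (locked). */
static struct wakeup_result model_remote_wakeup(void)
{
	struct mm_struct mm = {0};
	struct wakeup_result res;

	runqueues[0] = (struct rq){ .cpu = 0, .locked = false, .clock = 100 };
	runqueues[1] = (struct rq){ .cpu = 1, .locked = true, .clock = 200 };
	current_cpu = 0;

	res.this_rq_ok = snapshot_time_this_rq(&mm);
	res.dst_rq_ok = snapshot_time_rq(&runqueues[1], &mm);
	res.dst_time = mm.pcpu_cid[1].time;
	return res;
}
```

In this model the this_rq() based helper fails its lock check because the
waker's rq is unlocked, while the variant given the locked dst_rq succeeds
and records that rq's clock in the destination CPU's slot.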
> +

Thanks,
Aaron