Message-ID: <6b6569e2-895c-69a8-0c15-838bbe1d3233@efficios.com>
Date:   Thu, 20 Apr 2023 08:38:48 -0400
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Aaron Lu <aaron.lu@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, Olivier Dion <odion@...icios.com>,
        michael.christie@...cle.com
Subject: Re: [RFC PATCH v8] sched: Fix performance regression introduced by
 mm_cid

On 2023-04-20 02:00, Aaron Lu wrote:
> On Mon, Apr 17, 2023 at 11:08:31AM -0400, Mathieu Desnoyers wrote:
> 
>> +/*
>> + * Save a snapshot of the current runqueue time of this cpu
>> + * with the per-cpu cid value, allowing to estimate how recently it was used.
>> + */
>> +static inline void mm_cid_snapshot_time(struct mm_struct *mm)
>>   {
>> -	lockdep_assert_irqs_disabled();
>> -	if (cid < 0)
>> -		return;
>> -	raw_spin_lock(&mm->cid_lock);
>> -	__cpumask_clear_cpu(cid, mm_cidmask(mm));
>> -	raw_spin_unlock(&mm->cid_lock);
>> +	struct rq *rq = this_rq();
>> +	struct mm_cid *pcpu_cid;
>> +
>> +	lockdep_assert_rq_held(rq);
> 
> On wake up path when src_cid is migrated to dst_cid, this rq is the waker
> rq and is not locked, the wakee's dst_rq is locked.

Doh, yes, good catch, thanks! This one was puzzling me.

I'll fix this in my next version.
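
One possible shape for the fix, as a minimal sketch only: pass the runqueue that the caller actually holds instead of deriving it from this_rq(), and index the per-cpu cid by that runqueue's CPU. The two-argument signature below is an assumption, not the next version of the patch. The snippet is a userspace model: the struct layouts, the `locked` flag, and the plain array are hypothetical stand-ins for the kernel's rq lock, lockdep_assert_rq_held(), and per-cpu data.

```c
/* Userspace model only: struct layouts are simplified, hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 16

struct rq {
	int cpu;                  /* CPU owning this runqueue */
	bool locked;              /* stand-in for the rq lock state */
	unsigned long long clock; /* stand-in for rq->clock */
};

struct mm_cid {
	unsigned long long time;  /* last snapshot of the rq clock */
	int cid;
};

struct mm_struct {
	struct mm_cid pcpu_cid[NR_CPUS]; /* stand-in for the per-cpu array */
};

/* Hypothetical signature: the caller passes the rq it has locked. */
static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
{
	struct mm_cid *pcpu_cid;

	/* Models lockdep_assert_rq_held(rq) on the rq that was passed in. */
	assert(rq->locked);

	/* Models per_cpu_ptr(mm->pcpu_cid, cpu_of(rq)) rather than this_cpu_ptr(). */
	pcpu_cid = &mm->pcpu_cid[rq->cpu];

	/* Models WRITE_ONCE(pcpu_cid->time, rq->clock). */
	pcpu_cid->time = rq->clock;
}
```

With this shape, the wake-up path would pass the wakee's dst_rq, which is the one that is locked there, so the assertion checks the right runqueue and the snapshot lands in the destination CPU's slot.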

Thanks,

Mathieu

> 
> I got below warning on a VM boot with v8:
> 
> [    2.496964] ------------[ cut here ]------------
> [    2.497499] WARNING: CPU: 13 PID: 99 at kernel/sched/sched.h:1357 sched_mm_cid_migrate_to+0x2ce/0x330
> [    2.498478] Modules linked in:
> [    2.498481] CPU: 13 PID: 99 Comm: kworker/u32:5 Tainted: G        W 6.3.0-rc7-00002-gb8012ce004f4 #32
> [    2.498484] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc36 04/01/2014
> [    2.498485] Workqueue: events_unbound flush_to_ldisc
> [    2.501094] RIP: 0010:sched_mm_cid_migrate_to+0x2ce/0x330
> [    2.501099] Code: 45 89 74 24 08 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8d 7b 18 be ff ff ff ff7
> [    2.503101] RSP: 0018:ffffc900003d7ac0 EFLAGS: 00010046
> [    2.503608] RAX: 0000000000000000 RBX: ffff88842f3fe700 RCX: 0000000000000001
> [    2.504313] RDX: 0000000000000000 RSI: ffffffff823ccffd RDI: ffffffff8244e8fe
> [    2.505000] RBP: ffffe8ffff20c268 R08: 00000000954b8e6a R09: 00000000950aa3ff
> [    2.505680] R10: 00000000f950aa3f R11: ffff88810005e900 R12: ffffe8fffe60c268
> [    2.506406] R13: ffff88810005e900 R14: 0000000000000000 R15: 00000000ffffffff
> [    2.506935] FS:  0000000000000000(0000) GS:ffff88842f200000(0000) knlGS:0000000000000000
> [    2.507375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    2.507678] CR2: 00007f0aadff6db8 CR3: 0000000106ba2002 CR4: 0000000000770ee0
> [    2.508050] PKRU: 55555554
> [    2.508209] Call Trace:
> [    2.508342]  <TASK>
> [    2.508492]  ttwu_do_activate+0x129/0x300
> [    2.508727]  try_to_wake_up+0x2b7/0x8a0
> [    2.508963]  ep_autoremove_wake_function+0x11/0x50
> [    2.509259]  __wake_up_common+0x83/0x1a0
> [    2.509481]  __wake_up_common_lock+0x81/0xd0
> [    2.509738]  ep_poll_callback+0x147/0x310
> [    2.509965]  __wake_up_common+0x83/0x1a0
> [    2.510185]  __wake_up_common_lock+0x81/0xd0
> [    2.510463]  n_tty_receive_buf_common+0x235/0x6a0
> [    2.510728]  tty_port_default_receive_buf+0x3d/0x70
> [    2.510987]  flush_to_ldisc+0x9b/0x1a0
> [    2.511191]  process_one_work+0x27a/0x560
> [    2.511420]  worker_thread+0x4f/0x3b0
> [    2.511657]  ? __pfx_worker_thread+0x10/0x10
> [    2.511930]  kthread+0xf2/0x120
> [    2.512108]  ? __pfx_kthread+0x10/0x10
> [    2.512340]  ret_from_fork+0x29/0x50
> [    2.512552]  </TASK>
> [    2.512679] ---[ end trace 0000000000000000 ]---
> 
> $ ./scripts/faddr2line ../guest_debug/vmlinux sched_mm_cid_migrate_to+0x2ce
> sched_mm_cid_migrate_to+0x2ce/0x330:
> lockdep_assert_rq_held at kernel/sched/sched.h:1357
> (inlined by) mm_cid_snapshot_time at kernel/sched/sched.h:3355
> (inlined by) sched_mm_cid_migrate_to at kernel/sched/core.c:11666
> 
>> +	pcpu_cid = this_cpu_ptr(mm->pcpu_cid);
>> +	WRITE_ONCE(pcpu_cid->time, rq->clock);
>> +}
>> +
> 
> Thanks,
> Aaron

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
