Message-ID: <ZbG9TjHAMJYIvwsg@LeoBras>
Date: Wed, 24 Jan 2024 22:45:50 -0300
From: Leonardo Bras <leobras@...hat.com>
To: Tejun Heo <tj@...nel.org>
Cc: Leonardo Bras <leobras@...hat.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

On Wed, Jan 24, 2024 at 11:47:29AM -1000, Tejun Heo wrote:
> On Wed, Jan 24, 2024 at 05:29:37AM -0300, Leonardo Bras wrote:
> > +	/*
> > +	 * If the work is cpu-unbound, and cpu isolation is in place, only
> > +	 * schedule use timers from housekeeping cpus. In favor of avoiding
> > +	 * cacheline bouncing, run the WQ in the same cpu as the timer.
> > +	 */
> > +	if (cpu == WORK_CPU_UNBOUND && housekeeping_enabled(HK_TYPE_TIMER))
> > +		cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
> 
> Would it make more sense to use wq_unbound_cpumask?

Hello Tejun, thank you for this reply!

That's a good suggestion, but looking at workqueue_init_early() I see that, 
in short:
wq_unbound_cpumask = 	cpu_possible_mask & 
			housekeeping_cpumask(HK_TYPE_WQ) & 
			housekeeping_cpumask(HK_TYPE_DOMAIN) &
			wq_cmdline_cpumask

So wq_unbound_cpumask relates to domain and workqueue cpu isolation.
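To make the intersection above concrete, here is a minimal userspace sketch. It is not kernel code: cpumask_model_t, struct wq_mask_inputs and model_wq_unbound_cpumask() are hypothetical names made up for illustration, and each mask is modelled as a plain 64-bit word where bit n stands for cpu n.

```c
#include <stdint.h>

/* Userspace model only (assumption): bit n stands for cpu n, up to 64 cpus. */
typedef uint64_t cpumask_model_t;

/* Hypothetical container for the four masks named in the summary above. */
struct wq_mask_inputs {
	cpumask_model_t possible;   /* models cpu_possible_mask */
	cpumask_model_t hk_wq;      /* models housekeeping_cpumask(HK_TYPE_WQ) */
	cpumask_model_t hk_domain;  /* models housekeeping_cpumask(HK_TYPE_DOMAIN) */
	cpumask_model_t wq_cmdline; /* models wq_cmdline_cpumask */
};

/* A cpu stays in the result only if it is set in all four input masks. */
static cpumask_model_t model_wq_unbound_cpumask(const struct wq_mask_inputs *in)
{
	return in->possible & in->hk_wq & in->hk_domain & in->wq_cmdline;
}
```

Note that no timer-related mask takes part in this computation, which is the point made here.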

In our case, the mask is used to choose the cpu whose timer we want to use, 
so it makes sense to use the timer-related cpu isolation mask instead.

As of today, your suggestion would work the same, as the only way to enable 
WQ cpu isolation is to use nohz_full, which also enables TIMER cpu 
isolation. But since that can change in the future, for any reason, I would 
suggest that we stick to using the HK_TYPE_TIMER cpumask.

I now notice that this can introduce an issue: the work may end up running 
on a cpu outside of a valid wq_cmdline_cpumask.

I would suggest fixing this in one of two ways:
1 - We introduce a new cpumask which is basically 
    housekeeping_cpumask(HK_TYPE_DOMAIN) & wq_cmdline_cpumask, allowing us 
    to keep the timer interrupt in the same cpu as the scheduled function, or
2 - We use the resulting cpu only to pick the right timer, and leave the 
    work itself cpu-unbound.
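The difference between the two options can be sketched in userspace as below. This is only a model under stated assumptions, not a proposed patch: struct delayed_placement, model_any_cpu(), place_option1() and place_option2() are hypothetical names, masks are plain 64-bit words, and model_any_cpu() simply returns the lowest set bit, which is not necessarily how housekeeping_any_cpu() chooses.

```c
#include <stdint.h>

/* Stand-in for WORK_CPU_UNBOUND in this userspace model (assumption). */
#define MODEL_CPU_UNBOUND (-1)

/* Userspace model only: bit n stands for cpu n, up to 64 cpus. */
typedef uint64_t cpumask_model_t;

/* Hypothetical result: where the timer fires and where the work is queued. */
struct delayed_placement {
	int timer_cpu;
	int work_cpu;
};

/* Simplification: return the lowest cpu set in mask, or -1 if it is empty. */
static int model_any_cpu(cpumask_model_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/* Option 1: one combined mask picks a cpu used for both timer and work. */
static struct delayed_placement
place_option1(cpumask_model_t hk_domain, cpumask_model_t wq_cmdline)
{
	int cpu = model_any_cpu(hk_domain & wq_cmdline);

	return (struct delayed_placement){ .timer_cpu = cpu, .work_cpu = cpu };
}

/* Option 2: the chosen cpu only selects the timer; the work stays unbound. */
static struct delayed_placement place_option2(cpumask_model_t hk_timer)
{
	return (struct delayed_placement){
		.timer_cpu = model_any_cpu(hk_timer),
		.work_cpu = MODEL_CPU_UNBOUND,
	};
}
```

In this model, option 1 keeps timer and work on the same cpu, while option 2 lets the workqueue code place the work according to its own unbound cpumask.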

What are your thoughts on that?

Thank you!
Leo

