Message-ID: <ZbLWubUjuzFUKD5R@LeoBras>
Date: Thu, 25 Jan 2024 18:46:33 -0300
From: Leonardo Bras <leobras@...hat.com>
To: Tejun Heo <tj@...nel.org>
Cc: Leonardo Bras <leobras@...hat.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

On Thu, Jan 25, 2024 at 08:38:25AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Wed, Jan 24, 2024 at 10:45:50PM -0300, Leonardo Bras wrote:
> > That's a good suggestion, but looking at workqueue_init_early() I see that, 
> > in short:
> > wq_unbound_cpumask = 	cpu_possible_mask & 
> > 			housekeeping_cpumask(HK_TYPE_WQ) & 
> > 			housekeeping_cpumask(HK_TYPE_DOMAIN) &
> > 			wq_cmdline_cpumask
> > 
> > So wq_unbound_cpumask relates to domain and workqueue cpu isolation.
> > 
> > In our case, we are using this to choose which cpu's timer we want to 
> > use, so it makes sense to use timer-related cpu isolation instead.
> 
> - In the proposed code, when cpu == WORK_CPU_UNBOUND, it's always setting
>   cpu to housekeeping_any_cpu(HK_TYPE_TIMER). This may unnecessarily move
>   the timer and task away from local CPU. Preferring the local CPU would
>   likely make sense.
> 
> - If HK_TYPE_TIMER and workqueue masks may not agree, setting dwork->cpu to
>   the one returned from HK_TYPE_TIMER is likely problematic. That would
>   force __queue_work() to use that CPU instead of picking one from
>   wq_unbound_cpumask.
> 
> > As of today, your suggestion would work the same, as the only way to enable 
> > WQ cpu isolation is to use nohz_full, which also enables TIMER cpu 
> > isolation. But since that can change in the future, for any reason, I would 
> > suggest that we stick to using the HK_TYPE_TIMER cpumask.
> > 
> > I can now notice that this can end up introducing an issue: possibly 
> > running on a workqueue on a cpu outside of a valid wq_cmdline_cpumask.
> 
> Yeap.
> 
> > I would suggest fixing this in a couple ways:
> > 1 - We introduce a new cpumask which is basically 
> >     housekeeping_cpumask(HK_TYPE_DOMAIN) & wq_cmdline_cpumask, allowing us 
> >     to keep the timer interrupt in the same cpu as the scheduled function,
> > 2 - We use the resulting cpu only to pick the right timer.
> > 
> > What are your thoughts on that?
> 
> How about something like the following instead?
> 
> - If current CPU is in HK_TYPE_TIMER, pick that CPU.
> 
> - If not, pick a CPU from HK_TYPE_TIMER.
> 
> - Do add_timer_on() on the selected CPU but leave dwork->cpu as
>   WORK_CPU_UNBOUND and leave that part to __queue_work().
> 
> Thanks.

It looks like a good idea to me.

It's basically (2) with "keep the timer in this cpu if it's not isolated", 
which seems the right thing to do.
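
To make the three steps concrete, here is a rough userspace model of the 
selection logic. It is only a sketch, not the actual patch and not kernel 
code: the cpumask is modelled as a 64-bit word, every name in it 
(cpumask_model_t, delayed_work_model, pick_timer_cpu, and so on) is 
hypothetical, and picking the lowest-numbered housekeeping cpu is an 
assumption standing in for whatever housekeeping_any_cpu(HK_TYPE_TIMER) 
would return.

```c
#include <stdint.h>

/* Hypothetical userspace model: one bit per CPU. Not a kernel cpumask. */
typedef uint64_t cpumask_model_t;

/* Stand-in for WORK_CPU_UNBOUND; the value here is illustrative only. */
#define MODEL_WORK_CPU_UNBOUND (-1)

/* Hypothetical stand-in for the relevant parts of a delayed work item. */
struct delayed_work_model {
	int cpu;       /* CPU hint later handed to the queueing step */
	int timer_cpu; /* CPU whose timer is armed */
};

/* Return 1 if @cpu is set in @mask, 0 otherwise (or if out of range). */
static int mask_test_cpu(cpumask_model_t mask, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return 0;
	return (int)((mask >> cpu) & 1u);
}

/* Return the lowest-numbered CPU in @mask, or -1 if the mask is empty. */
static int mask_first_cpu(cpumask_model_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (mask_test_cpu(mask, cpu))
			return cpu;
	return -1;
}

/*
 * Steps 1 and 2: prefer the local CPU if it is a timer housekeeping CPU,
 * otherwise fall back to some CPU from the timer housekeeping mask.
 */
static int pick_timer_cpu(int this_cpu, cpumask_model_t hk_timer_mask)
{
	if (mask_test_cpu(hk_timer_mask, this_cpu))
		return this_cpu;
	return mask_first_cpu(hk_timer_mask);
}

/*
 * Step 3: arm the timer on the selected CPU, but leave the work item's
 * CPU unbound so the queueing step can still pick from its own mask.
 */
static void queue_delayed_unbound_model(struct delayed_work_model *dw,
					int this_cpu,
					cpumask_model_t hk_timer_mask)
{
	dw->timer_cpu = pick_timer_cpu(this_cpu, hk_timer_mask);
	dw->cpu = MODEL_WORK_CPU_UNBOUND;
}
```

With cpus 0-1 as timer housekeeping cpus (mask 0x3), a caller on cpu 1 keeps 
the timer local, while a caller on isolated cpu 5 gets the timer moved to 
cpu 0. In both cases the work item's cpu stays unbound.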

Thanks!
Leo


> 
> -- 
> tejun
> 

