Message-ID: <Y7tNPF4x+HYJUwjK@lothringen>
Date:   Mon, 9 Jan 2023 00:09:48 +0100
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     paulmck@...nel.org, Zqiang <qiang1.zhang@...el.com>,
        quic_neeraju@...cinc.com, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check

On Sat, Jan 07, 2023 at 09:55:22PM -0500, Joel Fernandes wrote:
> 
> 
> > On Jan 7, 2023, at 9:48 PM, Joel Fernandes <joel@...lfernandes.org> wrote:
> > 
> > 
> >>> On Jan 7, 2023, at 5:11 PM, Frederic Weisbecker <frederic@...nel.org> wrote:
> >>> 
> >>> On Fri, Jan 06, 2023 at 07:01:28PM -0500, Joel Fernandes wrote:
> >>> (lost html content)
> > 
> > My problem is the iPhone wises up when I put a web link in an email. I want to look into smtp relays but then if I spent time on fixing that, I might not get time to learn from emails like these... 
> > 
> >> I can't find a place where the exp grace period sends an IPI to
> >> CPUs slow to report a QS. But anyway you really need the tick to poll
> >> periodically on the CPU to chase a quiescent state.
> > 
> > Ok.
> > 
> >> Now arguably it's probably only useful when CONFIG_PREEMPT_COUNT=y
> >> and rcu_exp_handler() has interrupted a preempt-disabled or bh-disabled
> >> section. Although rcu_exp_handler() sets TIF_RESCHED, which is handled
> >> by preempt_enable() and local_bh_enable() when CONFIG_PREEMPT=y.
> >> So probably it's only useful when CONFIG_PREEMPT_COUNT=y and CONFIG_PREEMPT=n
> >> (and there is also PREEMPT_DYNAMIC to consider).
> > 
> > Makes sense. I think I was missing this use case and was going by the general design of exp grace periods. I was incorrectly assuming the IPIs were being sent repeatedly for hold out CPUs, which is not the case I think. But that would be another way to fix it?
> > 
> > But yeah I get your point, the first set of IPIs missed it, so we need the rescue-tick for long non-rcu_read_lock() implicit critical sections.. 
> > 
> >> If CONFIG_PREEMPT_COUNT=n, the tick can only report idle and user
> >> as QS, but those are already reported explicitly on ct_kernel_exit() ->
> >> rcu_preempt_deferred_qs().
> > 
> > Oh hmm, because that function is a NOOP for PREEMPT_COUNT=y and PREEMPT=n and will not report the deferred QS?  Maybe it should then. However I think the tick is still useful if after the preempt disabled section, we still did not exit the kernel.
> 
> I think I meant here, an atomic section (like bh or Irq disabled). There is no such thing as disabling preemption for CONFIG_PREEMPT=n. Or maybe I am confused again.  This RCU thing…

Right, so when CONFIG_PREEMPT_COUNT=n, there is no way for a tick to tell if the
interrupted code can safely be considered a QS. That's because
preempt_disable() <-> preempt_enable() are no-ops so the whole kernel is
assumed non-preemptible, and therefore the whole kernel is a read-side critical
section, except for the explicit points reporting a QS.

The only exception is when the tick interrupts idle (or user with
nohz_full). But we already have an exp QS reported on idle (and user with
nohz_full) entry through ct_kernel_exit(), and that happens on all RCU_TREE
configs (PREEMPT or not). Therefore the tick doesn't appear to be helpful at
all on a nohz_full CPU with CONFIG_PREEMPT_COUNT=n.

I suggest we don't bother optimizing that case though...
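
For illustration only, the tick's reasoning described above can be written
down as a toy userspace model. This is a sketch under stated assumptions, not
kernel code: struct tick_snapshot, its fields and tick_can_report_qs() are
hypothetical names invented for this example, and the "depth" field merely
stands in for whatever preempt/bh nesting information CONFIG_PREEMPT_COUNT=y
makes available.

```c
#include <stdbool.h>

/* Hypothetical snapshot of what a tick interrupted; not a kernel structure. */
struct tick_snapshot {
	bool preempt_count_built_in;   /* models CONFIG_PREEMPT_COUNT=y */
	bool interrupted_idle_or_user; /* idle, or user with nohz_full */
	int  preempt_bh_depth;         /* preempt/bh-disabled nesting; only
	                                * meaningful when counting is built in */
};

/* Returns true if the tick alone may treat the interrupted code as a QS. */
bool tick_can_report_qs(const struct tick_snapshot *s)
{
	/* Idle (or user with nohz_full) is a QS on every configuration. */
	if (s->interrupted_idle_or_user)
		return true;

	/*
	 * Without preempt counting, preempt_disable()/preempt_enable() are
	 * no-ops, so the tick must assume it interrupted a read-side
	 * critical section.
	 */
	if (!s->preempt_count_built_in)
		return false;

	/* With counting, a zero depth means no preempt/bh-disabled section. */
	return s->preempt_bh_depth == 0;
}
```

In this model the only "true" answer available with counting compiled out is
the idle/user one, which is the QS that is already reported explicitly on
entry, hence the tick adding nothing there.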

To summarize:

1) nohz_full && !CONFIG_PREEMPT_COUNT && !CONFIG_PREEMPT_RCU:
  Tick isn't helpful. It can only report idle/user QS, but that is
  already reported explicitly.

2) nohz_full && CONFIG_PREEMPT_COUNT && !CONFIG_PREEMPT_RCU:
  Tick is very helpful because it can tell if the kernel is in
  a QS state.

3) nohz_full && CONFIG_PREEMPT_RCU:
  Tick doesn't appear to be helpful because:
    - If the rcu_exp_handler() fires in an rcu_read_lock'ed section, then the
      exp QS is reported on rcu_read_unlock()
    - If the rcu_exp_handler() fires in a preempt/bh disabled section,
      TIF_RESCHED is forced which is handled on preempt/bh re-enablement,
      reporting a QS.
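
The three cases above can be condensed into a small lookup, shown here as a
hedged sketch rather than anything that exists in the tree. The enum and
exp_gp_tick_value() are hypothetical names made up for this example; the two
booleans stand for CONFIG_PREEMPT_COUNT and CONFIG_PREEMPT_RCU on a nohz_full
CPU.

```c
#include <stdbool.h>

/* Hypothetical verdict on whether the tick helps an expedited GP. */
enum tick_value {
	TICK_NOT_HELPFUL,
	TICK_HELPFUL,
};

/*
 * Classify a nohz_full CPU according to the three cases in the summary.
 * preempt_count models CONFIG_PREEMPT_COUNT, preempt_rcu models
 * CONFIG_PREEMPT_RCU.
 */
enum tick_value exp_gp_tick_value(bool preempt_count, bool preempt_rcu)
{
	/*
	 * Case 3: the QS is reported on rcu_read_unlock() or on preempt/bh
	 * re-enablement, so the tick adds nothing.
	 */
	if (preempt_rcu)
		return TICK_NOT_HELPFUL;

	/* Case 2: the tick can tell whether the kernel is in a QS state. */
	if (preempt_count)
		return TICK_HELPFUL;

	/* Case 1: only idle/user QS, already reported explicitly. */
	return TICK_NOT_HELPFUL;
}
```

Only the (preempt_count, !preempt_rcu) combination comes out as
TICK_HELPFUL, which is case 2).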
   
  
Case 2) is a niche, only useful for debugging. Anyway, I'm not sure it's
worth changing/optimizing the current state. It might be worth adding a comment
though.

Thanks.
