Date:   Sat, 10 Nov 2018 15:04:36 -0800
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     linux-kernel@...r.kernel.org, josh@...htriplett.org,
        rostedt@...dmis.org, mathieu.desnoyers@...icios.com,
        jiangshanlai@...il.com
Subject: Re: dyntick-idle CPU and node's qsmask

On Sat, Nov 10, 2018 at 01:46:59PM -0800, Joel Fernandes wrote:
> Hi Paul and everyone,
> 
> I was tracing/studying the RCU code today in the paul/dev branch and noticed
> that, for dyntick-idle CPUs, the RCU GP thread is clearing the bit in the leaf
> node's rnp->qsmask corresponding to the idle CPU, and reporting a QS on its
> behalf.
> 
> rcu_sched-10    [003]    40.008039: rcu_fqs:              rcu_sched 792 0 dti
> rcu_sched-10    [003]    40.008039: rcu_fqs:              rcu_sched 801 2 dti
> rcu_sched-10    [003]    40.008041: rcu_quiescent_state_report: rcu_sched 805 5>0 0 0 3 0
> 
> That's all good, but I was wondering if we can do better for the idle CPUs if
> we can somehow not set the qsmask of the node in the first place. Then no
> reporting of a quiescent state would be needed for idle CPUs, right?
> And we would also not need to acquire the rnp lock, I think.
> 
> At least for a single-node tree RCU system, it seems that would avoid needing
> to acquire the lock without complications. Anyway, let me know your thoughts,
> and I'm happy to discuss this in the hallways of LPC as well for folks
> attending :)
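
For reference, the behavior described above amounts to something like the
toy userspace model below: at grace-period start the leaf node's ->qsmask
is filled in from ->qsmaskinit, and a later FQS-style scan clears the bits
of CPUs found to be in dyntick idle, reporting the quiescent state on their
behalf.  The type and function names are invented for illustration; this is
a sketch of the idea, not the kernel/rcu/tree.c code, which also needs the
rnp lock and memory ordering that the toy version omits.

/* Toy model only -- hypothetical names, not the kernel's rcu_node/rcu_data. */
#include <stdio.h>
#include <stdbool.h>

#define NCPUS 4

struct toy_node {
	unsigned long qsmaskinit;	/* online CPUs under this leaf node */
	unsigned long qsmask;		/* CPUs still owing a QS this GP */
};

struct toy_cpu {
	bool dyntick_idle;		/* CPU is in an extended quiescent state */
};

static struct toy_cpu cpus[NCPUS] = {
	{ .dyntick_idle = false },
	{ .dyntick_idle = true },
	{ .dyntick_idle = true },
	{ .dyntick_idle = false },
};

/* GP initialization: charge every online CPU with providing a QS. */
static void toy_gp_init(struct toy_node *np)
{
	np->qsmask = np->qsmaskinit;
}

/* FQS-style scan: report a QS on behalf of dyntick-idle CPUs ("dti"). */
static void toy_force_qs(struct toy_node *np)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		if ((np->qsmask & (1UL << cpu)) && cpus[cpu].dyntick_idle)
			np->qsmask &= ~(1UL << cpu);
}

int main(void)
{
	struct toy_node node = { .qsmaskinit = (1UL << NCPUS) - 1 };

	toy_gp_init(&node);
	printf("qsmask after GP init: 0x%lx\n", node.qsmask);
	toy_force_qs(&node);
	printf("qsmask after FQS:     0x%lx\n", node.qsmask);
	return 0;
}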

We could, but that would require consulting the rcu_data structure for
each CPU while initializing the grace period, thus increasing the number
of cache misses during grace-period initialization and also shortly after
for any non-idle CPUs.  This seems backwards on busy systems, where each
CPU will, with high probability, report its own quiescent state before three
jiffies pass, in which case the cache misses on the rcu_data structures
would be wasted motion.

Now, this does increase overhead on mostly idle systems, but the theory
is that mostly idle systems are most able to absorb this extra overhead.
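
Concretely, the proposal amounts to moving that per-CPU check into the
initialization loop, along the lines of the sketch below (reusing the toy
types from the sketch above; again, hypothetical names, not a patch against
tree.c).  Each iteration now reads another CPU's per-CPU state, which is
where the extra cache misses at grace-period initialization would come from,
and which a busy CPU would in effect pay for nothing, since it would have
reported its own quiescent state shortly anyway.

/*
 * Proposed alternative (toy model): skip dyntick-idle CPUs while
 * building ->qsmask at GP start, so no QS ever needs to be reported
 * on their behalf.  The cpus[cpu].dyntick_idle read is the stand-in
 * for the per-CPU rcu_data consultation discussed above.
 */
static void toy_gp_init_skip_idle(struct toy_node *np)
{
	unsigned long mask = 0;

	for (int cpu = 0; cpu < NCPUS; cpu++) {
		if (!(np->qsmaskinit & (1UL << cpu)))
			continue;		/* CPU not online under this node */
		if (cpus[cpu].dyntick_idle)	/* extra per-CPU cacheline touched here */
			continue;		/* idle CPU never charged with a QS */
		mask |= 1UL << cpu;
	}
	np->qsmask = mask;
}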

Thoughts?

							Thanx, Paul
