Date:   Tue, 31 May 2022 16:23:35 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     nicolas saenz julienne <nsaenz@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Phil Auld <pauld@...hat.com>,
        Alex Belits <abelits@...vell.com>,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Yu Liao <liaoyu15@...wei.com>,
        Boqun Feng <boqun.feng@...il.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Paul Gortmaker <paul.gortmaker@...driver.com>,
        Uladzislau Rezki <uladzislau.rezki@...y.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH 20/21] rcu/context_tracking: Merge dynticks counter and
 context tracking states

On Mon, May 30, 2022 at 08:02:57PM +0200, nicolas saenz julienne wrote:
> Hi Frederic,
> 
> On Thu, 2022-05-19 at 16:58 +0200, Frederic Weisbecker wrote:
> > Updating the context tracking state and the RCU dynticks counter
> > atomically in a single operation is a first step towards improving CPU
> > isolation. This makes the context tracking state updates fully ordered
> > and therefore allow for later enhancements such as postponing some work
> > while a task is running isolated in userspace until it ever comes back
> > to the kernel.
> > 
> > The state field becomes divided in two parts:
> > 
> > 1) Two Lower bits for context tracking state:
> > 
> > 	CONTEXT_KERNEL = 0
> > 	CONTEXT_IDLE = 1,
> > 	CONTEXT_USER = 2,
> > 	CONTEXT_GUEST = 3,
> > 
> > 2) Higher bits for RCU eqs dynticks counting:
> > 
> >     RCU_DYNTICKS_IDX = 4
> > 
> >    The dynticks counting is always incremented by this value.
> >    (state & RCU_DYNTICKS_IDX) means we are NOT in an extended quiescent
> >    state. This makes the chance for a collision more likely between two
> >    RCU dynticks snapshots but wrapping up 28 bits of eqs dynticks
> >    increments still takes some bad luck (also rdp.dynticks_snap could be
> >    converted from int to long?)
> > 
> > Some RCU eqs functions have been renamed to better reflect their broader
> > scope that now include context tracking state.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > Cc: Paul E. McKenney <paulmck@...nel.org>
> > Cc: Peter Zijlstra <peterz@...radead.org>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: Neeraj Upadhyay <quic_neeraju@...cinc.com>
> > Cc: Uladzislau Rezki <uladzislau.rezki@...y.com>
> > Cc: Joel Fernandes <joel@...lfernandes.org>
> > Cc: Boqun Feng <boqun.feng@...il.com>
> > Cc: Nicolas Saenz Julienne <nsaenz@...nel.org>
> > Cc: Marcelo Tosatti <mtosatti@...hat.com>
> > Cc: Xiongfeng Wang <wangxiongfeng2@...wei.com>
> > Cc: Yu Liao<liaoyu15@...wei.com>
> > Cc: Phil Auld <pauld@...hat.com>
> > Cc: Paul Gortmaker<paul.gortmaker@...driver.com>
> > Cc: Alex Belits <abelits@...vell.com>
> > ---
> 
> While working on a feature on top of this series (IPI deferral stuff) I believe
> I've found a discrepancy on how context state is being updated:
> 
>  - When servicing an IRQ from user-space, we increment dynticks, and clear the
>    ct state to show we're in-kernel.
> 
>  - When servicing an IRQ from idle/guest or an NMI from any context we only
>    increment the dynticks counter. The ct state remains unchanged.

Hmm, an IRQ from userspace does:

    ct_user_enter()
    //run in user
        //-----IRQ
        ct_user_exit()
        ct_irq_enter()
        ct_irq_exit()
        ct_user_enter()
    //run in user

An IRQ from guest does:

    for (;;) {
        context_tracking_guest_enter()
        //vmrun
        //IRQ pending
        #VMEXIT
        context_tracking_guest_exit()
        local_irq_enable()
        ct_irq_enter()
        ct_irq_exit()
        local_irq_disable()
    }


    (Although I see there is an "sti" right before "vmrun", so it looks
    possible to have ct_irq_enter() after context_tracking_guest_enter()
    if a host IRQ fires between the sti and the vmrun. I might be
    missing some KVM subtlety, though.)

An IRQ from idle does just:

    ct_idle_enter()
        //IRQ
        ct_irq_enter()
        ct_irq_exit()
    ct_idle_exit()

So guest looks mostly OK to me (except for the little sti before vmrun,
about which I have a doubt). But idle at least is an exception: CONTEXT_IDLE
remains set during the interrupt handling. The idle case is not trivial to
handle because ct_irq_exit() needs to know that it is called between
ct_idle_enter() and ct_idle_exit().
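
To make the difference concrete, here is a rough userspace sketch of the
packed state word from the patch description (two low bits for the context,
counter incremented by RCU_DYNTICKS_IDX). It is only a model, not the kernel
code: ct_state_word, CT_STATE_MASK and all model_*() helpers are hypothetical
names, there is no per-CPU data and no atomics, and IRQ nesting is reduced to
a single flag. It replays the user and idle call sequences shown above and
reports which context bits the IRQ handler would observe.

```c
#include <assert.h>

/* Values as listed in the quoted patch description. */
enum ctx_state {
    CONTEXT_KERNEL = 0,
    CONTEXT_IDLE   = 1,
    CONTEXT_USER   = 2,
    CONTEXT_GUEST  = 3,
};

#define CT_STATE_MASK    3u  /* hypothetical name for the two low bits */
#define RCU_DYNTICKS_IDX 4u

/* Hypothetical single-CPU model of the merged state word. */
static unsigned int ct_state_word = RCU_DYNTICKS_IDX;
static int irq_left_eqs;     /* did the last IRQ entry leave an EQS? */

static unsigned int model_ctx(void)
{
    return ct_state_word & CT_STATE_MASK;
}

/* (state & RCU_DYNTICKS_IDX) set means NOT in an extended quiescent state. */
static int model_in_eqs(void)
{
    return !(ct_state_word & RCU_DYNTICKS_IDX);
}

/* Kernel -> idle/user/guest: context bits and counter move in one addition. */
static void model_eqs_enter(enum ctx_state ctx)
{
    assert(!model_in_eqs() && model_ctx() == CONTEXT_KERNEL);
    ct_state_word += RCU_DYNTICKS_IDX + ctx;
}

/* idle/user/guest -> kernel: clear the context bits, bump the counter. */
static void model_eqs_exit(enum ctx_state ctx)
{
    assert(model_in_eqs() && model_ctx() == ctx);
    ct_state_word += RCU_DYNTICKS_IDX - ctx;
}

/* IRQ entry: only the counter moves, and only if we interrupt an EQS. */
static void model_irq_enter(void)
{
    irq_left_eqs = model_in_eqs();
    if (irq_left_eqs)
        ct_state_word += RCU_DYNTICKS_IDX;
}

static void model_irq_exit(void)
{
    if (irq_left_eqs)
        ct_state_word += RCU_DYNTICKS_IDX;
    irq_left_eqs = 0;
}

/* IRQ from userspace: the explicit user exit clears the context bits first. */
static unsigned int model_irq_from_user(void)
{
    unsigned int seen;

    ct_state_word = RCU_DYNTICKS_IDX;
    model_eqs_enter(CONTEXT_USER);
    model_eqs_exit(CONTEXT_USER);   /* ct_user_exit() */
    model_irq_enter();
    seen = model_ctx();             /* what the handler observes */
    model_irq_exit();
    model_eqs_enter(CONTEXT_USER);  /* ct_user_enter() */
    return seen;
}

/* IRQ from idle: no idle exit happens, so the context bits stay as they were. */
static unsigned int model_irq_from_idle(void)
{
    unsigned int seen;

    ct_state_word = RCU_DYNTICKS_IDX;
    model_eqs_enter(CONTEXT_IDLE);
    model_irq_enter();
    seen = model_ctx();             /* what the handler observes */
    model_irq_exit();
    model_eqs_exit(CONTEXT_IDLE);
    return seen;
}
```

In this model, model_irq_from_user() returns CONTEXT_KERNEL while
model_irq_from_idle() returns CONTEXT_IDLE, which is the asymmetry discussed
above: the handler runs outside of the extended quiescent state in both cases,
but only the user path resets the two low bits.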

Thanks.
