Message-ID: <bfdc80bd-0be6-f591-e998-c3ad65283404@redhat.com>
Date:   Sun, 12 Jun 2022 21:23:14 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     paulmck@...nel.org
Cc:     "zhangfei.gao@...mail.com" <zhangfei.gao@...mail.com>,
        Zhangfei Gao <zhangfei.gao@...aro.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        rcu@...r.kernel.org, Lai Jiangshan <jiangshanlai@...il.com>,
        Josh Triplett <josh@...htriplett.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Matthew Wilcox <willy@...radead.org>,
        Shameerali Kolothum Thodi 
        <shameerali.kolothum.thodi@...wei.com>, mtosatti@...hat.com,
        sheng.yang@...el.com
Subject: Re: Commit 282d8998e997 (srcu: Prevent expedited GPs and blocking
 readers from consuming CPU) cause qemu boot slow

On 6/12/22 20:49, Paul E. McKenney wrote:
>>
>> 1) kvm->irq_srcu is hardly relying on the "sleepable" part; it has readers
>> that are very very small, but it needs extremely fast detection of grace
>> periods; see commit 719d93cd5f5c ("kvm/irqchip: Speed up
>> KVM_SET_GSI_ROUTING", 2014-05-05) which split it off kvm->srcu.  Readers are
>> not so frequent.
>>
>> 2) kvm->srcu is nastier because there are readers all the time.  The
>> read-side critical section are still short-ish, but they need the sleepable
>> part because they access user memory.
> 
> Which one of these two is in play in this case?

The latter, kvm->srcu; though at boot time both are hammered on quite a 
bit (and then essentially not at all).

For the one involved it's still pretty rare for readers to sleep, but it 
cannot be excluded.  Most critical sections are short, I'd guess in the 
thousands of clock cycles but I can add some instrumentation tomorrow 
(or anyway before Tuesday).

> The problem was not internal to SRCU, but rather due to the fact
> that kernel live patching (KLP) had problems with the CPU-bound tasks
> resulting from repeated synchronize_rcu_expedited() invocations.

I see.  Perhaps only add to the back-to-back counter if the 
synchronize_srcu_expedited() call takes longer than a jiffy?  This would 
indirectly check whether synchronize_srcu_expedited() readers are actually 
blocking.  KVM uses synchronize_srcu_expedited() because it expects it to 
take very little time (again, I'll get hard numbers asap).
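
To make the suggestion concrete, here is a rough userspace sketch of the 
accounting, not actual SRCU code.  The struct, the function name and the 
MAX_BACK_TO_BACK threshold are all hypothetical, and the jiffies values 
are passed in as plain arguments so the logic can be exercised outside 
the kernel.  Whether and when the counter gets reset is left out on 
purpose, since that policy belongs to the existing code.

```c
#include <stdbool.h>

/* Hypothetical threshold; the real limit lives in the SRCU code. */
#define MAX_BACK_TO_BACK 10UL

/* Hypothetical container for the back-to-back counter. */
struct gp_state {
	unsigned long back_to_back;	/* slow expedited GPs seen so far */
};

/*
 * Account for one completed expedited grace period.
 *
 * Only a grace period that took longer than one jiffy is counted, on
 * the assumption that a fast one means no reader was blocking.  With
 * jiffy granularity, a difference of 1 only proves that a tick boundary
 * was crossed, so the check requires a difference greater than 1.
 *
 * Returns true while further grace periods may still be expedited.
 */
static bool account_expedited_gp(struct gp_state *s,
				 unsigned long start_jiffies,
				 unsigned long end_jiffies)
{
	/* Unsigned subtraction stays correct across counter wraparound. */
	unsigned long elapsed = end_jiffies - start_jiffies;

	if (elapsed > 1)
		s->back_to_back++;

	return s->back_to_back < MAX_BACK_TO_BACK;
}
```

With this, a burst of short grace periods (the KVM boot case) never 
advances the counter, while repeated slow ones still hit the limit.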

Paolo
