linux-kernel - Re: [PATCH] kvm/x86: Handle async PF in RCU read-side critical sections

Message-ID: <42a732c2-e644-99dc-0fa0-81ebc919251c@redhat.com>
Date:   Mon, 2 Oct 2017 14:45:34 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     paulmck@...ux.vnet.ibm.com, Boqun Feng <boqun.feng@...il.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org
Subject: Re: [PATCH] kvm/x86: Handle async PF in RCU read-side critical
 sections

On 30/09/2017 19:15, Paul E. McKenney wrote:
> On Sat, Sep 30, 2017 at 07:41:56AM +0800, Boqun Feng wrote:
>> On Fri, Sep 29, 2017 at 04:43:39PM +0000, Paul E. McKenney wrote:
>>> Not to be repetitive, but if the schedule() is on the guest, this change
>>> really does silently break up an RCU read-side critical section on
>>> guests built with PREEMPT=n.  (Yes, they were already being broken,
>>> but it would be good to avoid this breakage in PREEMPT=n as well as
>>> in PREEMPT=y.)

Yes, you're right.  It's pretty surprising that it's never been reported.

>> Then probably add !IS_ENABLED(CONFIG_PREEMPT) as one of the reasons we
>> choose the halt path? Like:
>>
>> 	n.halted = is_idle_task(current) || preempt_count() > 1 ||
>> 		   !IS_ENABLED(CONFIG_PREEMPT) || rcu_preempt_depth();
>>
>>
>> But I think async PF could also happen while a user program is running?
>> Then maybe add a second parameter @user for kvm_async_pf_task_wait(),
>> like:
>>
>> 	kvm_async_pf_task_wait((u32)read_cr2(), user_mode(regs));
>>
>> and the halt condition becomes:
>>
>> 	n.halted = is_idle_task(current) || preempt_count() > 1 ||
>> 		   (!IS_ENABLED(CONFIG_PREEMPT) && !user) || rcu_preempt_depth();
>>
>> Thoughts?
> 
> This looks to me like it would cover it.  If !PREEMPT interrupt from
> kernel, we halt, which would prevent the sleep.
> 
> I take it that we get unhalted when the host gets things patched up?

Yes.  You get another page fault (this time it's a "page ready" page
fault rather than a "page not present" one), which has the side
effect of ending the halt.

Paolo

>> A side note: this being broken already for PREEMPT=n means we may be
>> failing to detect it in rcutorture? Then should we add a config with
>> KVM_GUEST=y and try to run some memory-consuming things (e.g. stress
>> --vm) in the rcutorture kvm script simultaneously? Paolo, do you have
>> any test workload that could trigger async PF quickly?
> 
> I do not believe that I have seen this in rcutorture, but I always run in
> a guest OS on a large-memory system (well, by my old-fashioned standards,
> anyway) that would be quite unlikely to evict a guest OS's pages.  Plus
> I tend to run on shared systems, and deliberately running them out of
> memory would not be particularly friendly to others using those systems.
> 
> I -do- run background scripts that are intended to force the host OS to
> preempt the guest OSes frequently, but I don't believe that this would
> cause that bug.
> 
> But it seems like it would make more sense to add this sort of thing to
> whatever KVM tests there are for host-side eviction of guest pages.
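
The halt condition proposed earlier in the thread can be modeled outside
the kernel as a pure predicate, which makes it easier to see which cases
halt and which are still allowed to sleep.  This is only a sketch: the
struct apf_ctx type and the apf_should_halt() name are hypothetical, and
each field merely stands in for the kernel helper named in its comment.
It is not the code that was merged.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the kernel's own types. */
struct apf_ctx {
	bool idle_task;      /* models is_idle_task(current) */
	int  preempt_count;  /* models preempt_count() */
	bool config_preempt; /* models IS_ENABLED(CONFIG_PREEMPT) */
	bool user;           /* models user_mode(regs) at the fault */
	int  rcu_depth;      /* models rcu_preempt_depth(); 0 if !PREEMPT */
};

/*
 * Returns true when the faulting context must halt instead of calling
 * schedule().  With PREEMPT=n there is no way to tell whether kernel
 * code is inside an RCU read-side critical section, so every fault
 * taken from kernel mode is treated as if it might be.
 */
static bool apf_should_halt(const struct apf_ctx *c)
{
	return c->idle_task || c->preempt_count > 1 ||
	       (!c->config_preempt && !c->user) || c->rcu_depth > 0;
}
```

In this model a PREEMPT=n guest sleeps only when the fault came from user
mode, while a PREEMPT=y guest can still sleep in kernel mode as long as it
is outside an RCU read-side critical section.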