lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 07 Mar 2020 16:52:17 +0100
From:   Thomas Gleixner <>
To:     Andy Lutomirski <>
Cc:     Andy Lutomirski <>,
        LKML <>, X86 ML <>,
        Paolo Bonzini <>, KVM <>,
        "Paul E. McKenney" <>
Subject: Re: [patch 2/2] x86/kvm: Sanitize kvm_async_pf_task_wait()

Andy Lutomirski <> writes:
> On Sat, Mar 7, 2020 at 2:01 AM Thomas Gleixner <> wrote:
>> > What’s the local_irq_disable() here for? I would believe a
>> > lockdep_assert_irqs_disabled() somewhere in here would make sense.
>> > (Yes, I see you copied this from the old code. It’s still nonsense.)
>> native_safe_halt() does:
>>          STI
>>          HLT
>> So the irq disable is required as the loop should exit with interrupts
>> disabled.
> Oops, should have looked at what native_safe_halt() does.
>> > I also find it truly bizarre that hlt actually works in this context.
>> > Does KVM in fact wake a HLTed guest that HLTed with IRQs off when a
>> > pending async pf is satisfied?  This would make sense if the wake
>> > event were an interrupt, but it’s not according to Paolo.
>> See above. safe halt enables interrupts, which means IF == 1 and the
>> host sanity check for IF == 1 is satisfied.
>> In fact, if e.g. some regular interrupt arrives before the page becomes
>> available and the guest entered the halt loop because the fault happened
>> inside a RCU read side critical section with preemption enabled, then
>> the interrupt might wake up another task, set need resched and this
>> other task can run.
> Now I'm confused again.  Your patch is very careful not to schedule if
> we're in an RCU read-side critical section, but the regular preemption
> code (preempt_schedule_irq, etc) seems to be willing to schedule
> inside an RCU read-side critical section.  Why is the latter okay but
> not the async pf case?

Preemption is fine, but voluntary schedule not. voluntary schedule might
end up in idle if this is the last runnable task.

> Ignoring that, this still seems racy:
> nested #PF telling us to wake up
> #PF returns

You will say Ooops, should have looked .... when I tell you that the
above cannot happen. From the SDM:

  If IF = 0, maskable hardware interrupts remain inhibited on the
  instruction boundary following an execution of STI.

Otherwise safe_halt would not work at all :)

> doesn't this result in putting the CPU asleep for no good reason until
> the next interrupt hits?

No :)

>> > All this being said, the only remotely sane case is when regs->flags
>> > has IF==1. Perhaps this code should actually do:
>> >
>> > WARN_ON(!(regs->flags & X86_EFLAGS_IF));
>> Yes, that want's to be somewhere early and also cover the async wake
>> case. Neither wake nor wait can be injected when IF == 0.
> Sadly, wrmsr to turn off async pf will inject wakeups even if IF == 0.

WHAT? That's fundamentally broken. Can you point me to the code in



Powered by blists - more mailing lists