[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b7912854-a7b1-caf5-aa23-e718318b0d7a@linux.ibm.com>
Date: Fri, 21 Jan 2022 15:42:48 +0100
From: Christian Borntraeger <borntraeger@...ux.ibm.com>
To: Mark Rutland <mark.rutland@....com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
Michael Ellerman <mpe@...erman.id.au>,
aleksandar.qemu.devel@...il.com, alexandru.elisei@....com,
anup.patel@....com, aou@...s.berkeley.edu, atish.patra@....com,
bp@...en8.de, catalin.marinas@....com, chenhuacai@...nel.org,
dave.hansen@...ux.intel.com, frankja@...ux.ibm.com,
frederic@...nel.org, gor@...ux.ibm.com, hca@...ux.ibm.com,
james.morse@....com, jmattson@...gle.com, joro@...tes.org,
luto@...nel.org, maz@...nel.org, mingo@...hat.com,
nsaenzju@...hat.com, palmer@...belt.com, paulmck@...nel.org,
paul.walmsley@...ive.com, peterz@...radead.org, seanjc@...gle.com,
suzuki.poulose@....com, svens@...ux.ibm.com, tglx@...utronix.de,
tsbogend@...ha.franken.de, vkuznets@...hat.com,
wanpengli@...cent.com, will@...nel.org,
Anup Patel <anup@...infault.org>,
Atish Patra <atishp@...shpatra.org>
Subject: Re: [PATCH v2 0/7] kvm: fix latent guest entry/exit bugs
Am 21.01.22 um 15:30 schrieb Mark Rutland:
> On Fri, Jan 21, 2022 at 03:17:01PM +0100, Christian Borntraeger wrote:
>>
>>
>> Am 21.01.22 um 10:53 schrieb Christian Borntraeger:
>>> Am 20.01.22 um 16:14 schrieb Christian Borntraeger:
>>>>
>>>>
>>>> Am 20.01.22 um 13:03 schrieb Mark Rutland:
>>>>> On Thu, Jan 20, 2022 at 12:28:09PM +0100, Paolo Bonzini wrote:
>>>>>> On 1/19/22 20:22, Mark Rutland wrote:
>>>>>>> I wonder, is the s390 guest entry/exit*preemptible* ?
>>>>>>>
>>>>>>> If a timer IRQ can preempt in the middle of the EQS, we wouldn't balance
>>>>>>> things before a ctx-switch to the idle thread, which would then be able
>>>>>>> to hit this.
>>>>>>>
>>>>>>> I'll need to go audit the other architectures for similar.
>>>>>>
>>>>>> They don't enable interrupts in the entry/exit path so they should be okay.
>>>>>
>>>>> True.
>>>>>
>>>>> So it sounds like for s390 adding an explicit preempt_{disable,enable}() is the
>>>>> right thing to do. I'll add that and explanatory commentary.
>>>>
>>>> That would not be trivial I guess. We do allow (and need) page faults on sie for guest
>>>> demand paging and
>>>>
>>>> this piece of arch/s390/mm/fault.c
>>>>
>>>> case GMAP_FAULT:
>>>> if (faulthandler_disabled() || !mm)
>>>> goto out;
>>>> break;
>>>> }
>>>>
>>>> would no longer work since faulthandler_disabled checks for the preempt count.
>>>>
>>>
>>>
>>> Something like this
>>>
>>>
>>> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
>>> index d30f5986fa85..1c7d45346e12 100644
>>> --- a/arch/s390/mm/fault.c
>>> +++ b/arch/s390/mm/fault.c
>>> @@ -385,10 +385,18 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>>> return 0;
>>> goto out;
>>> case USER_FAULT:
>>> - case GMAP_FAULT:
>>> if (faulthandler_disabled() || !mm)
>>> goto out;
>>> break;
>>> + /*
>>> + * We know that we interrupted SIE and we are not in a IRQ.
>>> + * preemption might be disabled thus checking for in_atomic
>>> + * would result in failures
>>> + */
>>> + case GMAP_FAULT:
>>> + if (pagefault_disabled() || !mm)
>>> + goto out;
>>> + break;
>>> }
>>>
>>> perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
>>>
>>> seems to work with preemption disabled around sie. Not sure yet if this is correct.
>>
>>
>> No it does not work. scheduling while preemption is disabled.
>
> Hmm... which exceptions do we expect to take from a guest?
>
> I wonder if we can handle those more like other architectures by getting those
> to immediately return from the sie64a() call with some status code that we can
> handle outside of the guest_state_{enter,exit}_irqoff() critical section.
We take all kind of page faults (invalid->valid, ro->rw) on the sie instruction and
run that in the context of the pgm_check handler just like for userspace. the pgm_check
handler does a sie_exit (similar to the interrupt handlers) by setting the return IA.
> On arm64 in VHE mode, we swap our exception vectors when entering/exiting the
> guest to allow us to do that; I wonder if we could do similar here?
Powered by blists - more mailing lists