lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 11 Nov 2022 15:50:04 +0100
From:   Juergen Gross <jgross@...e.com>
To:     Jan Beulich <jbeulich@...e.com>
Cc:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>,
        Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
        Maximilian Heyne <mheyne@...zon.de>,
        xen-devel@...ts.xenproject.org,
        LKML <linux-kernel@...r.kernel.org>,
        Stefano Stabellini <sstabellini@...nel.org>
Subject: Re: [PATCH v4] x86/xen: Add support for
 HVMOP_set_evtchn_upcall_vector

On 11.11.22 14:17, Jan Beulich wrote:
> On 11.11.2022 13:44, Juergen Gross wrote:
>> On 11.11.22 10:01, Juergen Gross wrote:
>>> On 08.11.22 17:26, Jan Beulich wrote:
>>>> On 03.11.2022 16:41, Jan Beulich wrote:
>>>>> On 03.11.2022 14:38, Jan Beulich wrote:
>>>>>> On 29.07.2022 09:04, Jane Malalane wrote:
>>>>>>> @@ -125,6 +130,9 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_xen_hvm_callback)
>>>>>>>    {
>>>>>>>        struct pt_regs *old_regs = set_irq_regs(regs);
>>>>>>> +    if (xen_percpu_upcall)
>>>>>>> +        ack_APIC_irq();
>>>>>>> +
>>>>>>>        inc_irq_stat(irq_hv_callback_count);
>>>>>>>        xen_hvm_evtchn_do_upcall();
>>>>>>> @@ -168,6 +176,15 @@ static int xen_cpu_up_prepare_hvm(unsigned int cpu)
>>>>>>>        if (!xen_have_vector_callback)
>>>>>>>            return 0;
>>>>>>> +    if (xen_percpu_upcall) {
>>>>>>> +        rc = xen_set_upcall_vector(cpu);
>>>>>>
>>>>>>   From all I can tell at least for APs this happens before setup_local_apic().
>>>>>> With there being APIC interaction in this operation mode, as seen e.g. in
>>>>>> the earlier hunk above, I think this is logically wrong. And it leads to
>>>>>> apic_pending_intr_clear() issuing its warning: The vector registration, as
>>>>>> an intentional side effect, marks the vector as pending. Unless IRQs were
>>>>>> enabled at any point between the registration and the check, there's
>>>>>> simply no way for the corresponding IRR bit to be dealt with (by
>>>>>> propagating to ISR when the interrupt is delivered, and then being cleared
>>>>>> from ISR by EOI).
>>>>>
>>>>> With Roger's help I now have a pointer to osstest also exposing the issue:
>>>>>
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/174592/test-amd64-amd64-xl-pvhv2-intel/huxelrebe0---var-log-xen-console-guest-debian.guest.osstest.log.gz
>>>>
>>>> I've noticed only now that my mail to Jane bounced, and I'm now told
>>>> she's no longer in her role at Citrix. Since I don't expect to have time
>>>> to investigate an appropriate solution here, may I ask whether one of
>>>> the two of you could look into this, being the maintainers of this code?
>>>
>>> I think the correct way to handle this would be:
>>>
>>> - rename CPUHP_AP_ARM_XEN_STARTING to CPUHP_AP_XEN_STARTING
>>> - move the xen_set_upcall_vector() call to a new hotplug callback
>>>     registered for CPUHP_AP_XEN_STARTING (this can be done even
>>>     conditionally only if xen_percpu_upcall is set)
>>>
>>> Writing a patch now ...
>>
>> For the APs this is working as expected.
>>
>> The boot processor seems to be harder to fix. The related message is being
>> issued even with interrupts being on when setup_local_APIC() is called.
> 
> Hmm, puzzling: I don't recall having seen the message for the BSP. Which
> made me assume (without having actually checked) that ...
> 
>> I've tried to register the callback only after the setup_local_APIC() call,
> 
> ... it's already happening afterwards in that case.
> 
>> but this results in a system hang when the APs are started.
>>
>> Any ideas?
> 
> Not really, to be honest.

I might be wrong here, but is a bit set in IRR plus interrupts enabled
enough to make the kernel happy? The local APIC isn't enabled yet when
apic_pending_intr_clear() is being called, so IMHO the hypervisor will
never propagate the bit to ISR.

I didn't find any specific information in the SDM regarding "accepting
an interrupt" of a disabled local APIC, but maybe I didn't find the
relevant part of the manual.


Juergen

Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)

Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ