lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c5b1c3d7-56ca-4afc-a831-045dba4beffa@os.amperecomputing.com>
Date: Mon, 9 Dec 2024 17:55:34 +0530
From: Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>
To: Marc Zyngier <maz@...nel.org>
Cc: kvmarm <kvmarm@...ts.linux.dev>, linux-arm-kernel@...ts.infradead.org,
 linux-kernel@...r.kernel.org, oliver.upton@...ux.dev,
 christoffer.dall@....com, suzuki.poulose@....com, will@...nel.org,
 catalin.marinas@....com, coltonlewis@...gle.com, joey.gouly@....com,
 yuzenghui@...wei.com, darren@...amperecomputing.com,
 vishnu@...amperecomputing.com
Subject: Re: [PATCH] KVM: arm64: nv: Set ISTATUS for emulated timers, If timer
 expired


Hi Marc,

On 09-12-2024 03:24 pm, Marc Zyngier wrote:
> Ganapatrao,
> 
> Did you notice that the Columbia list was killed over two years ago,
> as per ac107abef1976 ("KVM: arm64: Advertise new kvmarm mailing
> list")? Consider that script/get_maintainer.pl is your friend.

My bad, copy paste issue!.
> 
> Cc'ing the correct list instead,

Thanks.
> 
> On Mon, 09 Dec 2024 05:32:01 +0000,
> Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com> wrote:
>>
>> During automated testing of Nested Virtualization using avocado-vt,
> 
> Which is not merged upstream. So what branch are you using? Based on
> what kernel version? On what HW? With which virtualisation features?
> 

Testing is done on Ampere's AmpereOne platform using 6.10 based kernel 
with NV patches from your repo.

>> it has been observed that during some boot test iterations,
>> the Guest-Hypervisor boot was getting crashed with a
>> synchronous exception while it is still booting EDK2.
>>
>> The test is launching Multiple instances of Guest-Hypervisor boot
> 
> Is the multiple instance aspect really relevant to the reproduction of
> the problem?

Not really, but it requires multiple attempts/iterations to hit the 
issue. Even with automated test, it was seen at some iteration out of 10 
to 15 iterations.

> 
>> and while booting, QEMU monitor issued the command "info register"
>> at regular intervals to take a register dump. To execute this
>> command, QEMU stops the run and does the register read of various
>> registers. While resuming the run, the function kvm_arm_timer_write()
>> writes back the saved CNTV_CTL_EL0 register with ISTATUS cleared always
> 
> It is userspace that causes this write-back, right? AFAICT, KVM never
> does that on its own.
> 
>> and resulting in the loss of pending interrupt for emulated timers.
> 
> How does a missing interrupt result in a synchronous exception in
> EDK2? In my experience, EDK2 panics if it sees spurious interrupts,
> not when it is missing interrupts (it just locks up, which is
> expected).

Not sure, why it is hitting exception, rather than hang at EDK2.
However, EDK2 timer handler code is ignoring the interrupt since ISTATUS 
is not set and not moving CVAL forward.

> 
>> In hardware based timers, ISTATUS is a RO/WI bit and gets set by the
>> h/w, if the condition is still met. However, in Nested-Virtualization
>> case, the Guest-Hypervisor's EDK2 is using an emulated virtual timer
>> and losing ISTATUS state and the interrupt forever.
> 
> Why is this specific to NV? Can't the same thing happen to the
> physical timer in a non-VHE configuration?
> 

You mean, emulated v-timer in non-VHE boot?
It might impact non-VHE case as well, not tried though.

>>
>> Adding fix in kvm_arm_timer_write to set ISTATUS for emulated
>> timers, if the timer expired already.
>>
>> Fixes: 81dc9504a700 ("KVM: arm64: nv: timers: Support hyp timer emulation")
> 
> Where is this coming from? This patch doesn't touch the code you are
> changing, so how can it be the source of your problems? As far as I
> can tell, this has been the case since 5c5196da4e966 ("KVM: arm/arm64:
> Support EL1 phys timer register access in set/get reg").

Thanks, 5c5196da4e966 seems more relevant!.

>> Co-developed-by: Vishnu Pajjuri <vishnu@...amperecomputing.com>
>> Signed-off-by: Vishnu Pajjuri <vishnu@...amperecomputing.com>
>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>
>> ---
>>   arch/arm64/kvm/arch_timer.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
>> index 1215df590418..aca58113d790 100644
>> --- a/arch/arm64/kvm/arch_timer.c
>> +++ b/arch/arm64/kvm/arch_timer.c
>> @@ -1199,7 +1199,16 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
>>   		break;
>>   
>>   	case TIMER_REG_CTL:
>> -		timer_set_ctl(timer, val & ~ARCH_TIMER_CTRL_IT_STAT);
>> +		struct timer_map map;
>> +
>> +		val &= ~ARCH_TIMER_CTRL_IT_STAT;
>> +		get_timer_map(vcpu, &map);
>> +		/* Set ISTATUS bit for emulated timers, if timer expired. */
>> +		if (timer == map.emul_vtimer || timer == map.emul_ptimer) {
>> +			if (!kvm_timer_compute_delta(timer))
>> +				val |= ARCH_TIMER_CTRL_IT_STAT;
>> +		}
>> +		timer_set_ctl(timer, val);
>>   		break;
> 
> This really looks awfully complicated, when it is only a matter of
> recomputing the interrupt state, something that is at the core of the
> timer emulation. Why can't the following work:

Added check to handle ISTATUS for emulated timers only, since it is RO 
for hardware timers, thought it might confuse others(non-nv case).
Otherwise below diff from you should work as well. I will try it out.

> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 895f09658ef83..bd6efafbe7109 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -1298,13 +1298,17 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
>   				enum kvm_arch_timer_regs treg,
>   				u64 val)
>   {
> +	unsigned long tmp = val;
> +
>   	switch (treg) {
>   	case TIMER_REG_TVAL:
>   		timer_set_cval(timer, kvm_phys_timer_read() - timer_get_offset(timer) + (s32)val);
>   		break;
>   
>   	case TIMER_REG_CTL:
> -		timer_set_ctl(timer, val & ~ARCH_TIMER_CTRL_IT_STAT);
> +		__assign_bit(__ffs(ARCH_TIMER_CTRL_IT_STAT), &tmp,
> +			     kvm_timer_should_fire(timer));
> +		timer_set_ctl(timer, tmp);
>   		break;
>   
>   	case TIMER_REG_CVAL:
> 
> But overall, this very much looks like it is only papering over the
> real issue, which is that the *emulation* should regenerate the
> pending bit, and not rely on the userspace interface.
> 
> As far as I can tell, we already correctly compute the status bit on
> read (read_timer_ctl()), so the guest should always observe something
> consistent when it traps. We also *never* use the status bit as an
> input to the emulation, and always recompute it from scratch (it is
> only there for the benefit of the guest or userspace).
> 

For emulated timers, we are not asserting again by calling 
kvm_timer_update_irq in timer_emulate() until the level is down and 
ready for trigger again. This was done to fix high rate of spurious 
interrupts getting generated to V-Timer. Hence we are not able to 
recover, if once ISTATUS is lost.

> So I can't see how upstream is broken in at the moment, and you need
> to explain how this actually triggers. Ideally, with a standalone
> reproducer or a selftest.

We could reproduce the issue with the simple test/script.
On one shell, launch L1 using qemu with add-on option

"-monitor unix:gh_monitor,server,nowait

On another shell, while L1 boots and still in UEFI, run repeatedly the 
command (or put in a while 1 loop script)

"echo "info registers" | socat - unix-connect:gh_monitor > 
/tmp/info_registers"

With above steps we were able to hit the issue within few attempts.

-- 
Thanks,
Ganapat/GK


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ