Message-ID: <CAAhV-H6fobEooN5tWf0nWoyc96sL9VzfAA7BxJz=R7WWDt-k2w@mail.gmail.com>
Date: Fri, 28 Nov 2025 10:04:54 +0800
From: Huacai Chen <chenhuacai@...nel.org>
To: Bibo Mao <maobibo@...ngson.cn>
Cc: "open list:LOONGARCH" <loongarch@...ts.linux.dev>, Paolo Bonzini <pbonzini@...hat.com>, 
	Sean Christopherson <seanjc@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/6] KVM: LoongArch: selftests: Add timer test case

On Fri, Nov 28, 2025 at 9:11 AM Bibo Mao <maobibo@...ngson.cn> wrote:
>
>
>
> On 2025/11/27 10:38 PM, Huacai Chen wrote:
> > On Thu, Nov 27, 2025 at 9:02 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>
> >>
> >>
> >> On 2025/11/27 3:11 PM, Huacai Chen wrote:
> >>> On Thu, Nov 27, 2025 at 3:03 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2025/11/27 2:54 PM, Huacai Chen wrote:
> >>>>> On Thu, Nov 27, 2025 at 2:48 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2025/11/27 2:42 PM, Huacai Chen wrote:
> >>>>>>> On Thu, Nov 27, 2025 at 2:21 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2025/11/27 10:51 AM, Huacai Chen wrote:
> >>>>>>>>> On Thu, Nov 27, 2025 at 10:48 AM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 2025/11/27 10:45 AM, Huacai Chen wrote:
> >>>>>>>>>>> On Thu, Nov 27, 2025 at 10:37 AM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 2025/11/27 10:09 AM, Huacai Chen wrote:
> >>>>>>>>>>>>> On Thu, Nov 27, 2025 at 9:08 AM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 2025/11/26 9:43 PM, Huacai Chen wrote:
> >>>>>>>>>>>>>>> On Mon, Nov 24, 2025 at 10:17 AM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 2025/11/24 10:03 AM, Huacai Chen wrote:
> >>>>>>>>>>>>>>>>> On Mon, Nov 24, 2025 at 9:58 AM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 2025/11/21 10:08 PM, Huacai Chen wrote:
> >>>>>>>>>>>>>>>>>>> Hi, Bibo,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Thu, Nov 20, 2025 at 2:58 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> This patchset adds a timer test case for LoongArch systems, based on the
> >>>>>>>>>>>>>>>>>>>> common arch_timer test case. It includes one-shot and period mode timer
> >>>>>>>>>>>>>>>>>>>> interrupt tests, a software emulated timer test and a time counter
> >>>>>>>>>>>>>>>>>>>> test.
> >>>>>>>>>>>>>>>>>>> I tested this series on top of 6.18-rc6 on a Loongson-3A5000; sometimes
> >>>>>>>>>>>>>>>>>>> it passes, and sometimes I get:
> >>>>>>>>>>>>>>>>>>> [root@...ora kvm]# ./arch_timer
> >>>>>>>>>>>>>>>>>>> Random seed: 0x6b8b4567
> >>>>>>>>>>>>>>>>>>> Guest assert failed,  vcpu 2; stage; 0; iter: 1
> >>>>>>>>>>>>>>>>>>> ==== Test Assertion Failure ====
> >>>>>>>>>>>>>>>>>>>             loongarch/arch_timer.c:79: irq_iter == 0
> >>>>>>>>>>>>>>>>>>>             pid=60138 tid=60142 errno=4 - Interrupted system call
> >>>>>>>>>>>>>>>>>>>                1  0x00000001200037cf: test_vcpu_run at arch_timer.c:70
> >>>>>>>>>>>>>>>>>>>                2  0x00007ffff2449f27: ?? ??:0
> >>>>>>>>>>>>>>>>>>>                3  0x00007ffff24c0633: ?? ??:0
> >>>>>>>>>>>>>>>>>>>             irq_iter = 0x1.
> >>>>>>>>>>>>>>>>>>>             Guest period timer interrupt was not triggered within the specified
> >>>>>>>>>>>>>>>>>>>             interval, try to increase the error margin by [-e] option.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Is this expected, or is something wrong?
> >>>>>>>>>>>>>>>>>> There is a problem with that. In general, the vCPU task can be rescheduled
> >>>>>>>>>>>>>>>>>> onto another CPU or preempted, so the period timer interrupt is not handled
> >>>>>>>>>>>>>>>>>> within the specified time.
> >>>>>>>>>>>>>>>>> Then does this series need to be updated, or does the problem come from somewhere else?
> >>>>>>>>>>>>>>>> I think this series needs to be updated; the success criteria for the period
> >>>>>>>>>>>>>>>> timer test need to consider this situation. Let me check how to handle this.
> >>>>>>>>>>>>>>> Any updates available?
> >>>>>>>>>>>>>> It can be solved by modifying the udelay() method based on get_cycles(), or by
> >>>>>>>>>>>>>> using a CPU loop counting method.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --- a/tools/testing/selftests/kvm/include/loongarch/arch_timer.h
> >>>>>>>>>>>>>> +++ b/tools/testing/selftests/kvm/include/loongarch/arch_timer.h
> >>>>>>>>>>>>>> @@ -71,10 +71,17 @@ static inline void timer_irq_disable(void)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>          static inline void __delay(uint64_t cycles)
> >>>>>>>>>>>>>>          {
> >>>>>>>>>>>>>> -       uint64_t start = timer_get_cycles();
> >>>>>>>>>>>>>> -
> >>>>>>>>>>>>>> -       while ((timer_get_cycles() - start) < cycles)
> >>>>>>>>>>>>>> -               cpu_relax();
> >>>>>>>>>>>>>> +       uint64_t start, next, loops = 0;
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>> +       start = timer_get_cycles();
> >>>>>>>>>>>>>> +       while (loops < cycles) {
> >>>>>>>>>>>>>> +               next = timer_get_cycles();
> >>>>>>>>>>>>>> +               /* only count one cycle if VM is preempted */
> >>>>>>>>>>>>>> +               if (next > start) {
> >>>>>>>>>>>>>> +                       loops++;
> >>>>>>>>>>>>>> +                       start = next;
> >>>>>>>>>>>>>> +               }
> >>>>>>>>>>>>>> +       }
> >>>>>>>>>>>>>>          }
> >>>>>>>>>>>>> Looks good. But ARM64 and RISC-V also use a simple implementation of
> >>>>>>>>>>>> there is no period test on them.
> >>>>>>>>>>> I think the one-shot test can also have this problem if the CPU is
> >>>>>>>>>>> preempted for a very long time.
> >>>>>>>>>>>
> >>>>>>>>>>>>> __delay(). So should this problem be thought of as a common problem?
> >>>>>>>>>>>>> If yes, maybe we can keep __delay() as is and wait for the common
> >>>>>>>>>>>>> parts to be fixed.
> >>>>>>>>>>>> Also, there is no common udelay() API; it is arch specific. Someone may
> >>>>>>>>>>>> argue that skipping stolen cycles is not generic for __delay(), since other
> >>>>>>>>>>>> test cases want accurate cycles rather than skipping stolen cycles. It
> >>>>>>>>>>>> is specific to the timer test case.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Or we could add another API, __delay_loops(), or keep it as is and wait for
> >>>>>>>>>>>> other architectures; there should be no common part for it.
> >>>>>>>>>>> Yes, no common part for it, but it can be a common problem. If other
> >>>>>>>>>>> architectures have problems they should also modify their own
> >>>>>>>>>>> __delay(), right?
> >>>>>>>>>> Yes. What should we do then?
> >>>>>>>>> The merge window is coming, so let's keep it as is. This problem only
> >>>>>>>>> exists when the background load is high (so preemption happens easily),
> >>>>>>>>> and I think this is not the usual case.
> >>>>>>>> Another method is to modify the period timer test case so that it calls
> >>>>>>>> udelay() in a loop rather than once, something like this:
> >>>>>>>> --- a/tools/testing/selftests/kvm/loongarch/arch_timer.c
> >>>>>>>> +++ b/tools/testing/selftests/kvm/loongarch/arch_timer.c
> >>>>>>>> @@ -75,7 +75,7 @@ static void guest_test_oneshot_timer(uint32_t cpu)
> >>>>>>>>
> >>>>>>>>       static void guest_test_period_timer(uint32_t cpu)
> >>>>>>>>       {
> >>>>>>>> -       uint32_t irq_iter;
> >>>>>>>> +       uint32_t irq_iter, config_iter;
> >>>>>>>>              uint64_t us;
> >>>>>>>>              struct test_vcpu_shared_data *shared_data = &vcpu_shared_data[cpu];
> >>>>>>>>
> >>>>>>>> @@ -84,7 +84,8 @@ static void guest_test_period_timer(uint32_t cpu)
> >>>>>>>>              us = msecs_to_usecs(test_args.timer_period_ms) +
> >>>>>>>> test_args.timer_err_margin_us;
> >>>>>>>>              timer_set_next_cmp_ms(test_args.timer_period_ms, true);
> >>>>>>>>              /* Setup a timeout for the interrupt to arrive */
> >>>>>>>> -       udelay(us * test_args.nr_iter);
> >>>>>>>> +       for (config_iter = 0; config_iter < test_args.nr_iter;
> >>>>>>>> config_iter++)
> >>>>>>>> +               udelay(us);
> >>>>>>> This can reduce the probability, but it can still cause problems if the
> >>>>>>> background CPU load is very high. So I suggest keeping it as is.
> >>>>>> If the merge window is close, a fix can be posted later.
> >>>>>>
> >>>>>> Even if the background CPU load is high, a timer interrupt will happen
> >>>>>> every time udelay() is called. So the total number of timer interrupts
> >>>>>> triggered will meet the test case requirements.
> >>>>> I said it only "reduces the probability" because errors are also observed
> >>>>> in the one-shot test case. If we use a "retry method" as the fix, then
> >>>>> the one-shot test case needs it as well.
> >>>> The one-shot test case already uses the split udelay() method. I also observed
> >>>> a one-shot test case failure on 3D6000; it is a KVM timer emulation issue.
> >>> Yes, the one-shot test case uses udelay() in a loop, but it also calls
> >>> __GUEST_ASSERT() in a loop. So it reports an error even if only one of
> >>> the udelay() calls times out.
> >> Sorry, I do not understand what you mean. What is the problem
> >> with the one-shot test case? Do you mean there is a problem with the
> >> one-shot test case as well?
> > Sorry for the confusing description.
> > Yes, the one-shot test has problems, and the reason is similar to the
> > period test, if I understand correctly.
> > 1. The error report comes from __GUEST_ASSERT(), right?
> > 2. The reason for the period test error is that the interrupt hasn't
> > triggered during udelay(), right?
> > 3. For the one-shot test, there is also a udelay() and a
> > __GUEST_ASSERT() in every iteration, right?
> > 4. In any iteration of the loop, if an interrupt hasn't triggered
> > during udelay(), __GUEST_ASSERT() will report errors, right?
> All are right. The only question is why, in one loop iteration, an interrupt
> is not triggered during udelay(). Is the udelay() too short, or are there
> other reasons?
>
> If udelay() takes longer than expected, the timer interrupt should
> happen between udelay() and __GUEST_ASSERT(), is that right?
Yes, I think I get the key point. If a vCPU is preempted, then delay()
doesn't return; and once delay() returns, the timer interrupt has
probably been received. So splitting the delay() invocations increases
the number of checks and reduces failures.

Huacai

>
> Regards
> Bibo Mao
> >
> >
> > Huacai
> >
> >>
> >> Regards
> >> Bibo Mao
> >>>
> >>> Huacai
> >>>
> >>>>
> >>>> Well, let's keep it as is for the present.
> >>>>>
> >>>>> Huacai
> >>>>>
> >>>>>>
> >>>>>> Regards
> >>>>>> Bibo Mao
> >>>>>>>
> >>>>>>>
> >>>>>>> Huacai
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> Bibo Mao
> >>>>>>>>>
> >>>>>>>>> Huacai
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Huacai
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards
> >>>>>>>>>>>> Bibo Mao
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Huacai
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards
> >>>>>>>>>>>>>> Bibo Mao
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Huacai
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Regards
> >>>>>>>>>>>>>>>> Bibo Mao
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Huacai
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Regards
> >>>>>>>>>>>>>>>>>> Bibo Mao
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Huacai
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>>>>>> v2 ... v3:
> >>>>>>>>>>>>>>>>>>>>             1. Adjust order about patch 2 and patch 3
> >>>>>>>>>>>>>>>>>>>>             2. Add test case with alphabetical order
> >>>>>>>>>>>>>>>>>>>>             3. Merge one-shot and period timer interrupt test case into one
> >>>>>>>>>>>>>>>>>>>>             4. Only add LoongArch specific modification with common file
> >>>>>>>>>>>>>>>>>>>>                Makefile.kvm
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> v1 ... v2:
> >>>>>>>>>>>>>>>>>>>>             1. Restore PC and PRMD after exception handler
> >>>>>>>>>>>>>>>>>>>>             2. Split patch 4 into two small patches with period timer test and
> >>>>>>>>>>>>>>>>>>>>                time counter test
> >>>>>>>>>>>>>>>>>>>>             3. With time counter test, set time count with 0 when create VM. And
> >>>>>>>>>>>>>>>>>>>>                verify time count starts from 0 in guest code
> >>>>>>>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>>>>>> Bibo Mao (6):
> >>>>>>>>>>>>>>>>>>>>             KVM: LoongArch: selftests: Add system registers save and restore on
> >>>>>>>>>>>>>>>>>>>>               exception
> >>>>>>>>>>>>>>>>>>>>             KVM: LoongArch: selftests: Add basic interfaces
> >>>>>>>>>>>>>>>>>>>>             KVM: LoongArch: selftests: Add exception handler register interface
> >>>>>>>>>>>>>>>>>>>>             KVM: LoongArch: selftests: Add timer interrupt test case
> >>>>>>>>>>>>>>>>>>>>             KVM: LoongArch: selftests: Add SW emulated timer test
> >>>>>>>>>>>>>>>>>>>>             KVM: LoongArch: selftests: Add time counter test
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>            tools/testing/selftests/kvm/Makefile.kvm      |   1 +
> >>>>>>>>>>>>>>>>>>>>            .../kvm/include/loongarch/arch_timer.h        |  84 ++++++++
> >>>>>>>>>>>>>>>>>>>>            .../kvm/include/loongarch/processor.h         |  81 +++++++-
> >>>>>>>>>>>>>>>>>>>>            .../selftests/kvm/lib/loongarch/exception.S   |   6 +
> >>>>>>>>>>>>>>>>>>>>            .../selftests/kvm/lib/loongarch/processor.c   |  47 ++++-
> >>>>>>>>>>>>>>>>>>>>            .../selftests/kvm/loongarch/arch_timer.c      | 194 ++++++++++++++++++
> >>>>>>>>>>>>>>>>>>>>            6 files changed, 410 insertions(+), 3 deletions(-)
> >>>>>>>>>>>>>>>>>>>>            create mode 100644 tools/testing/selftests/kvm/include/loongarch/arch_timer.h
> >>>>>>>>>>>>>>>>>>>>            create mode 100644 tools/testing/selftests/kvm/loongarch/arch_timer.c
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> base-commit: 23cb64fb76257309e396ea4cec8396d4a1dbae68
> >>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>> 2.39.3
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
