lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87o8dxf597.ffs@nanos.tec.linutronix.de>
Date:   Thu, 29 Apr 2021 18:02:44 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Zelin Deng <zelin.deng@...ux.alibaba.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Wanpeng Li <wanpengli@...cent.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH] Guest system time jumps when new vCPUs is hot-added

On Thu, Apr 29 2021 at 17:38, Zelin Deng wrote:
> On 2021/4/29 下午4:46, Thomas Gleixner wrote:
>> And that validation expects that the CPUs involved run in a tight loop
>> concurrently so the TSC readouts which happen on both can be reliably
>> compared.
>>
>> But this cannot be guaranteed on vCPUs at all, because the host can
>> schedule out one or both at any point during that synchronization
>> check.
>
> Is there any plan to fix this?

The above cannot be fixed.

As I said before the solution is:

>> A two socket guest setup needs to have information from the host that
>> TSC is usable and that the socket sync check can be skipped. Anything
>> else is just doomed to fail in hard to diagnose ways.
>
> Yes, I had tried to add "tsc=unstable" to skip tsc sync.  However if a 

tsc=unstable? Oh well.

> user process which is not pined to vCPU is using rdtsc, it can get tsc 
> warp, because it can be scheduled among vCPUs.  Does it mean user

Only if the hypervisor is not doing the right thing and makes sure that
all vCPUs have the same tsc offset vs. the host TSC.

> applications have to guarantee itself to use rdtsc only when TSC is 
> reliable?

If the TSCs of CPUs are not in sync then the kernel does the right thing
and uses some other clocksource for the various time interfaces, e.g.
the kernel provides clock_getttime() which guarantees to be correct
whether TSC is usable or not.

Any application using RDTSC directly is own their own and it's not a
kernel problem.

The host kernel cannot make guarantees that the hardware is sane neither
can a guest kernel make guarantees that the hypervisor is sane.

Thanks,

        tglx




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ