Message-ID: <20160317151007.GF20310@potion.brq.redhat.com>
Date:	Thu, 17 Mar 2016 16:10:07 +0100
From:	Radim Krcmar <rkrcmar@...hat.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Paolo Bonzini <pbonzini@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	kvm list <kvm@...r.kernel.org>, Alexander Graf <agraf@...e.de>
Subject: Re: [PATCH 1/5] x86/kvm: On KVM re-enable (e.g. after suspend),
 update clocks

2016-03-16 16:07-0700, Andy Lutomirski:
> On Wed, Mar 16, 2016 at 3:59 PM, Radim Krcmar <rkrcmar@...hat.com> wrote:
>> 2016-03-16 15:15-0700, Andy Lutomirski:
>>> FWIW, if you ever intend to support ART ("always running timer")
>>> passthrough, this is going to be a giant clusterfsck.  Good luck.  I
>>> haven't gotten a straight answer as to what hardware actually supports
>>> that thing, so even testing isn't easy.
>>
>> Hm, AR TSC would be best handled by doing nothing ... dropping the
>> faking logic just became tempting.

ART is different from what I initially thought: it's the underlying
mechanism for invariant TSC and nothing more ...  we already forbid
migrations when the guest knows about invariant TSC, so we could do the
same and let ART be virtualized.  (Suspend has to be forbidden too.)

> As it stands, ART is screwed if you adjust the VMCS's tsc offset.  But

Luckily, assigning real hardware can prevent migration or suspend, so we
won't need to adjust the offset during runtime.  TSC is a generally
unmigratable device that just happens to live on the CPU.

(It would have been better to hide TSC capability from the guest and only
 use rdtsc for kvmclock if the guest wanted fancy features.)

> I think it's also screwed if you migrate to a machine with a different
> ratio of guest TSC ticks to host ART ticks or a different offset,
> because the host isn't going to do the rdmsr every time it tries to
> access the ART, so passing it through might require a paravirt
> mechanism no matter what.

It's almost certain that the other host will have a different offset,
which makes TSC unmigratable in software without even considering ART
or frequencies.  Well, KVM already emulates different TSC frequency, so
we could emulate ART without sinking much lower. :)
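To make the offset problem concrete, here is a rough sketch (not KVM
code) of the arithmetic behind keeping the guest TSC continuous across
hosts.  It assumes the guest reads host_tsc * ratio + offset, models the
ratio as an exact fraction rather than the hardware's fixed-point
multiplier, and resumes the guest TSC at the value it had when it was
saved.  The names guest_tsc and offset_for_continuity and all numbers
are made up for illustration.

```python
from fractions import Fraction


def guest_tsc(host_tsc, offset, ratio=Fraction(1)):
    """Value the guest would read: scaled host TSC plus an offset (assumed model)."""
    return int(host_tsc * ratio) + offset


def offset_for_continuity(last_guest_tsc, host_tsc_now, ratio=Fraction(1)):
    """Offset that makes the guest TSC resume exactly at last_guest_tsc."""
    return last_guest_tsc - int(host_tsc_now * ratio)


# Source host: hypothetical counter value and offset, no scaling.
src_host_tsc = 9_000_000
src_offset = -8_000_000
saved = guest_tsc(src_host_tsc, src_offset)

# Destination host: freshly booted, and running at 3/2 of the guest's
# advertised TSC frequency, so host ticks are scaled by 2/3.
dst_host_tsc = 300
dst_ratio = Fraction(2, 3)
dst_offset = offset_for_continuity(saved, dst_host_tsc, dst_ratio)

# The guest continues from the saved value and keeps its own tick rate.
resumed = guest_tsc(dst_host_tsc, dst_offset, dst_ratio)
later = guest_tsc(dst_host_tsc + 3_000, dst_offset, dst_ratio)
print(saved, dst_offset, resumed, later - resumed)
```

The offset has to be recomputed from the destination's own counter, so
the guest cannot be moved without the hypervisor adjusting it.  The
sketch also ignores the time spent in transit.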

> ISTM that, if KVM tries to keep the guest TSC monotonic across
> migration, it should probably also keep it monotonic across host
> suspend/resume.

Yes, "pausing" TSC during suspend or migration is one way of improving
the TSC estimate.  If we want to emulate ART, then the estimate is
noticeably lacking, because TSC and ART are defined by a simple
equation (SDM 2015-12, 17.14.4 Invariant Time-Keeping):
 TSC_Value = (ART_Value * CPUID.15H:EBX[31:0]) / CPUID.15H:EAX[31:0] + K

where the guest thinks that CPUID and K are constant (between events
that the guest knows of), so we should give the best estimate of how
many TSC cycles have passed.  (The best estimate is still lacking.)
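A small worked example of that equation, as a sketch only.  The CPUID
values, K, and tick counts are invented, and it assumes ART keeps
counting while the guest TSC is held still.  It shows how a guest that
treats K as constant would see the relation break: the K implied by a
(TSC, ART) pair shifts by exactly the TSC cycles that were not
accounted for.

```python
def tsc_from_art(art, numerator, denominator, k):
    """TSC_Value = (ART_Value * EBX) / EAX + K, using integer division."""
    if denominator == 0:
        raise ValueError("CPUID.15H:EAX (denominator) must be non-zero")
    return (art * numerator) // denominator + k


def implied_k(tsc, art, numerator, denominator):
    """K that a guest would infer from one (TSC, ART) sample."""
    return tsc - (art * numerator) // denominator


# Hypothetical CPUID.15H values: EBX = 200, EAX = 2, so 100 TSC ticks per ART tick.
NUM, DEN = 200, 2
K = 1_000

art_before = 5_000
tsc_before = tsc_from_art(art_before, NUM, DEN, K)

# Assumed scenario: ART advances by 3000 ticks while the guest TSC is paused.
art_after = art_before + 3_000
tsc_paused = tsc_before

k_after = implied_k(tsc_paused, art_after, NUM, DEN)
missing_cycles = K - k_after
print(tsc_before, k_after, missing_cycles)
```

Here missing_cycles is the number of TSC cycles the estimate would have
to add back for the guest's constant-K assumption to keep holding.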

>                  After all, host suspend/resume is kind of like
> migrating from the pre-suspend host to the post-resume host.  Maybe it
> could even share code.

Hopefully ... host suspend/resume is driven by the kernel and migration
is driven by userspace, which might complicate sharing.
