Message-ID: <20190905121638.GD4320@e113682-lin.lund.arm.com>
Date:   Thu, 5 Sep 2019 14:16:38 +0200
From:   Christoffer Dall <christoffer.dall@....com>
To:     Heinrich Schuchardt <xypron.glpk@....de>
Cc:     Stefan Hajnoczi <stefanha@...hat.com>,
        Daniel P. Berrangé <berrange@...hat.com>,
        Marc Zyngier <marc.zyngier@....com>,
        linux-kernel@...r.kernel.org, kvmarm@...ts.cs.columbia.edu,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/1] KVM: inject data abort if instruction cannot be
 decoded

Hi Heinrich,

On Thu, Sep 05, 2019 at 02:01:36PM +0200, Heinrich Schuchardt wrote:
> On 9/5/19 11:20 AM, Stefan Hajnoczi wrote:
> > On Wed, Sep 04, 2019 at 08:07:36PM +0200, Heinrich Schuchardt wrote:
> > > If an application tries to access memory that is not mapped, an error
> > > ENOSYS, "load/store instruction decoding not implemented" may occur.
> > > QEMU will hang with a register dump.
> > > 
> > > Instead create a data abort that can be handled gracefully by the
> > > application running in the virtual environment.
> > > 
> > > Now the virtual machine can react to the event in the most appropriate
> > > way - by recovering, by writing an informative log, or by rebooting.
> > > 
> > > Signed-off-by: Heinrich Schuchardt <xypron.glpk@....de>
> > > ---
> > >   virt/kvm/arm/mmio.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
> > > index a8a6a0c883f1..0cbed7d6a0f4 100644
> > > --- a/virt/kvm/arm/mmio.c
> > > +++ b/virt/kvm/arm/mmio.c
> > > @@ -161,8 +161,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > >   		if (ret)
> > >   			return ret;
> > >   	} else {
> > > -		kvm_err("load/store instruction decoding not implemented\n");
> > > -		return -ENOSYS;
> > > +		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> > > +		return 1;
> > 
> > I see this more as a temporary debugging hack than something to merge.
> > 
> > It sounds like in your case the guest environment provided good
> > debugging information and you preferred it over debugging this from the
> > host side.  That's fine, but allowing the guest to continue running in
> > the general case makes it much harder to track down the root cause of a
> > problem because many guest CPU instructions may be executed after the
> > original problem occurs.  Other guest software may fail silently in
> > weird ways.  IMO it's best to fail early.
> > 
> > Stefan
> > 
> 
> As virtual machines are ubiquitous, expect mission-critical systems to
> run on them as well. At development time, halting a machine may be a
> good idea. In production this is often the worst solution. Rebooting
> may be essential for survival.
> 
> For an anecdotal example see:
> https://www.hq.nasa.gov/alsj/a11/a11.1201-pa.html
> 
> I am convinced that leaving it to the guest to decide how to react is
> the best choice.
> 
Maintaining strong adherence to the architecture is equally important,
and I'm sure we can find anecdotes showing that not doing the expected
can also lead to disastrous outcomes.

Have you had a look at the suggested patch I sent?  The idea is that we
can preserve the existing legacy ABI, allow for a better debugging
experience, allow userspace to do emulation if it so wishes, and provide
a better error message if userspace doesn't handle this properly.

One thing we could change from my proposed patch would be to have KVM
inject the access as an external abort if the target address also
doesn't hit an MMIO device, which is by far the most common scenario
reported here on the list.
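
To make the combined behaviour easier to discuss, here is a rough sketch
of the decision only.  It is not taken from the proposed patch: the enum,
the function name and both inputs (hits_mmio_device, userspace_opt_in) are
hypothetical, and the ordering of the checks is an assumption.

```c
#include <stdbool.h>

/* Hypothetical outcomes; these are illustrative names, not kernel identifiers. */
enum abort_action {
	ACT_LEGACY_ENOSYS,      /* fail as before, preserving the legacy ABI */
	ACT_EXIT_TO_USERSPACE,  /* hand the access to the VMM for emulation */
	ACT_INJECT_EXT_ABORT,   /* report an external abort to the guest */
};

/*
 * Decide how to handle a guest data abort for which no load/store
 * decoding information is available.
 *
 * hits_mmio_device: assumed input, true if the faulting address belongs
 *                   to an emulated MMIO device.
 * userspace_opt_in: assumed input, true if the VMM has asked to receive
 *                   such accesses so that it can emulate them itself.
 */
enum abort_action undecodable_abort_action(bool hits_mmio_device,
					   bool userspace_opt_in)
{
	/* Nothing is mapped at the address: let the guest see the abort. */
	if (!hits_mmio_device)
		return ACT_INJECT_EXT_ABORT;

	/* A device is there and the VMM volunteered to emulate the access. */
	if (userspace_opt_in)
		return ACT_EXIT_TO_USERSPACE;

	/* Otherwise keep today's behaviour for existing userspace. */
	return ACT_LEGACY_ENOSYS;
}
```

Under these assumptions a stray access to unmapped memory never reaches
the legacy error path, while existing userspace that has not opted in
sees no change for accesses that do target a device.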

Hopefully, a mission-critical deployment based on KVM/Arm (scary as that
sounds) would use a recent and patched VMM (QEMU) that either causes
the external abort or reboots the VM, as per the configuration of the
particular system in question.


Thanks,

    Christoffer
