Message-ID: <CANpmjNMR4BgfCxL9qXn0sQrJtQJbEPKxJ5_HEa2VXWi6UY4wig@mail.gmail.com>
Date: Fri, 10 Apr 2020 11:47:23 +0200
From: Marco Elver <elver@...gle.com>
To: Qian Cai <cai@....pw>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
kasan-dev <kasan-dev@...glegroups.com>,
LKML <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org
Subject: Re: KCSAN + KVM = host reset

On Fri, 10 Apr 2020 at 01:00, Qian Cai <cai@....pw> wrote:
>
>
>
> > On Apr 9, 2020, at 5:28 PM, Qian Cai <cai@....pw> wrote:
> >
> >
> >
> >> On Apr 9, 2020, at 12:03 PM, Marco Elver <elver@...gle.com> wrote:
> >>
> >> On Thu, 9 Apr 2020 at 17:30, Qian Cai <cai@....pw> wrote:
> >>>
> >>>
> >>>
> >>>> On Apr 9, 2020, at 11:22 AM, Marco Elver <elver@...gle.com> wrote:
> >>>>
> >>>> On Thu, 9 Apr 2020 at 17:10, Qian Cai <cai@....pw> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Apr 9, 2020, at 3:03 AM, Marco Elver <elver@...gle.com> wrote:
> >>>>>>
> >>>>>> On Wed, 8 Apr 2020 at 23:29, Qian Cai <cai@....pw> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Apr 8, 2020, at 5:25 PM, Paolo Bonzini <pbonzini@...hat.com> wrote:
> >>>>>>>>
> >>>>>>>> On 08/04/20 22:59, Qian Cai wrote:
> >>>>>>>>> Running a simple thing on this AMD host would trigger a reset right away.
> >>>>>>>>> Unselecting the KCSAN kconfig makes everything work fine (the host would also
> >>>>>>>>> reset if only "echo off > /sys/kernel/debug/kcsan" was run before running qemu-kvm).
> >>>>>>>>
> >>>>>>>> Is this a regression or something you've just started to play with? (If
> >>>>>>>> anything, the assembly language conversion of the AMD world switch that
> >>>>>>>> is in linux-next could have reduced the likelihood of such a failure,
> >>>>>>>> not increased it).
> >>>>>>>
> >>>>>>> I don’t remember having tried this combination before, so I don’t know if it is a
> >>>>>>> regression or not.
> >>>>>>
> >>>>>> What happens with KASAN? My guess is that, since it also happens with
> >>>>>> "off", something that should not be instrumented is being
> >>>>>> instrumented.
> >>>>>
> >>>>> No, KASAN + KVM works fine.
> >>>>>
> >>>>>>
> >>>>>> What happens if you put a 'KCSAN_SANITIZE := n' into
> >>>>>> arch/x86/kvm/Makefile? Since it's hard for me to reproduce on this
> >>>>>
> >>>>> Yes, that works, but this below alone does not work,
> >>>>>
> >>>>> KCSAN_SANITIZE_kvm-amd.o := n
> >>>>
> >>>> There are some other files as well, that you could try until you hit
> >>>> the right one.
> >>>>
> >>>> But since this is in arch, 'KCSAN_SANITIZE := n' wouldn't be too bad
> >>>> for now. If you can't narrow it down further, do you want to send a
> >>>> patch?
> >>>
> >>> No, that would be pretty bad because it will disable KCSAN for Intel
> >>> KVM as well, which is working perfectly fine right now. Only AMD is
> >>> broken.
> >>
> >> Interesting. Unfortunately I don't have access to an AMD machine right now.
> >>
> >> Actually I think it should be:
> >>
> >> KCSAN_SANITIZE_svm.o := n
> >> KCSAN_SANITIZE_pmu_amd.o := n
> >>
> >> If you want to disable KCSAN for kvm-amd.
> >
> > KCSAN_SANITIZE_svm.o := n
> >
> > That alone works fine. I am wondering which functions there could
> > perhaps trigger some kind of recursion with KCSAN?
>
> Another data point: setting CONFIG_KCSAN_INTERRUPT_WATCHER=n alone
> also fixed the issue. I saw quite a few interrupt-related functions in svm.c, so
> is some interrupt-related recursion going on?

That would contradict what you said about it working if KCSAN is
"off". What kernel are you attempting to use in the VM?

Thanks,
-- Marco
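
For reference, the Makefile variants discussed in this thread could be sketched as the kbuild fragment below. This is only a summary under assumptions: it presumes the lines go into arch/x86/kvm/Makefile, as suggested earlier in the thread, and that only one variant is enabled at a time. The object names are the ones quoted above and have not been checked against any particular tree.

```makefile
# Sketch for arch/x86/kvm/Makefile (placement assumed from the thread).
# Enable only one variant at a time.

# Variant A: exclude every object built by this Makefile from KCSAN.
# Reported to work, but it also disables KCSAN for Intel KVM.
#KCSAN_SANITIZE := n

# Variant B: exclude only the objects named in the thread for kvm-amd.
KCSAN_SANITIZE_svm.o := n
KCSAN_SANITIZE_pmu_amd.o := n
```

According to the report above, excluding svm.o alone was already enough to avoid the reset, so the pmu_amd.o line may be unnecessary. CONFIG_KCSAN_INTERRUPT_WATCHER=n, the other workaround mentioned, is a kernel config option and would be set in the .config rather than in this Makefile.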