Message-ID: <20240816180441.81f4d694-3b-amachhiw@linux.ibm.com>
Date: Fri, 16 Aug 2024 18:13:40 +0530
From: Amit Machhiwal <amachhiw@...ux.ibm.com>
To: Michael Ellerman <mpe@...erman.id.au>
Cc: Bjorn Helgaas <helgaas@...nel.org>, Rob Herring <robh@...nel.org>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        devicetree@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
        kvm-ppc@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
        Lizhi Hou <lizhi.hou@....com>, Saravana Kannan <saravanak@...gle.com>,
        Vaibhav Jain <vaibhav@...ux.ibm.com>,
        Nicholas Piggin <npiggin@...il.com>,
        Vaidyanathan Srinivasan <svaidy@...ux.ibm.com>,
        Kowshik Jois B S <kowsjois@...ux.ibm.com>,
        Lukas Wunner <lukas@...ner.de>, kernel-team@...ts.ubuntu.com,
        Stefan Bader <stefan.bader@...onical.com>
Subject: Re: [PATCH v3] PCI: Fix crash during pci_dev hot-unplug on pseries
 KVM guest
Hi Michael,
On 2024/08/15 01:20 PM, Michael Ellerman wrote:
> Bjorn Helgaas <helgaas@...nel.org> writes:
> > On Sat, Aug 03, 2024 at 12:03:25AM +0530, Amit Machhiwal wrote:
> >> With CONFIG_PCI_DYNAMIC_OF_NODES [1], a hot-plug and hot-unplug sequence
> >> of a PCI device attached to a PCI-bridge causes following kernel Oops on
> >> a pseries KVM guest:
> >
> > What is unique about pseries here?  There's nothing specific to
> > pseries in the patch, so I would expect this to be a generic problem
> > on any arch.
> >
> >>  RTAS: event: 2, Type: Hotplug Event (229), Severity: 1
> >>  Kernel attempted to read user page (10ec00000048) - exploit attempt? (uid: 0)
> >>  BUG: Unable to handle kernel data access on read at 0x10ec00000048
> >
> > Weird address.  I would expect NULL or something.  Where did this
> > non-NULL pointer come from?
> 
> It originally comes from np->data, which is supposed to be an
> of_changeset.
> 
> The powerpc code also uses np->data for the struct pci_dn pointer, see
> pci_add_device_node_info().
> 
> I wonder if that's why it's non-NULL?
I'm also looking into the code to figure out where that value is coming from. I
will update as soon as I get there.
> 
> Amit, do we have exact steps to reproduce this? I poked around a bit but
> couldn't get it to trigger.
Sure, below are the steps:
1. Set CONFIG_PCI_DYNAMIC_OF_NODES=y in the kernel config and compile (Fedora
   has it disabled in its distro config; Ubuntu has it enabled but will have it
   disabled in the next update)
2. If you are using Fedora cloud images, make sure you have these packages
   installed:
    $ rpm -qa | grep -e 'ppc64-diag\|powerpc-utils'
    powerpc-utils-core-1.3.11-6.fc40.ppc64le
    powerpc-utils-1.3.11-6.fc40.ppc64le
    ppc64-diag-rtas-2.7.9-6.fc40.ppc64le
    ppc64-diag-2.7.9-6.fc40.ppc64le
3. Hot-plug a PCI device as follows:
    virsh attach-interface <domain_name> bridge --source virbr0
4. Check that the PCI device was added by running `ip a s`
5. Hot-unplug that device by supplying its MAC address, which should trigger
   the Oops:
    virsh detach-interface <domain_name> bridge <mac_addr>
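Steps 3 to 5 could be scripted on the host roughly as below. This is only a
sketch: the `run` and `hotplug_cycle` helpers and the `DRY_RUN` variable are
made up for illustration, the domain name and MAC are placeholders, and the
10-second wait is arbitrary. It also differs from the steps above in two ways:
it passes a fixed MAC with `--mac` at attach time so the same address can be
reused for the detach, and it lists interfaces with `virsh domiflist` on the
host rather than running `ip a s` in the guest.

```shell
#!/bin/sh
# Sketch of steps 3-5, meant to run on the host.
# Set DRY_RUN=1 to print the virsh commands instead of executing them.

# Hypothetical helper: run a command, or just print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: one hot-plug / hot-unplug cycle.
#   $1 = libvirt domain name (placeholder)
#   $2 = MAC address to assign to the new interface (placeholder)
hotplug_cycle() {
    domain=$1
    mac=$2

    # Step 3: hot-plug a bridge interface with a known MAC.
    run virsh attach-interface "$domain" bridge --source virbr0 --mac "$mac" || return 1

    # Step 4 (host-side substitute): list the domain's interfaces.
    run virsh domiflist "$domain" || return 1

    # Give the guest some time to process the hotplug event (arbitrary delay).
    [ "${DRY_RUN:-0}" = "1" ] || sleep 10

    # Step 5: hot-unplug the same interface by MAC; this should trigger the Oops.
    run virsh detach-interface "$domain" bridge --mac "$mac"
}
```

For example, `DRY_RUN=1 hotplug_cycle <domain_name> 52:54:00:12:34:56` only
prints the three virsh commands, which is handy for checking them before
running against a real guest.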
Thanks,
Amit
> cheers