linux-kernel - Re: [PATCH] ACPI: PCI: Fix device reference counting in acpi_get_pci

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y0/sGveKPjuUWOhO@intel.com>
Date:   Wed, 19 Oct 2022 15:22:50 +0300
From:   Ville Syrjälä <ville.syrjala@...ux.intel.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Linux PCI <linux-pci@...r.kernel.org>,
        Linux ACPI <linux-acpi@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Bjorn Helgaas <helgaas@...nel.org>,
        dri-devel@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org
Subject: Re: [PATCH] ACPI: PCI: Fix device reference counting in
 acpi_get_pci_dev()

On Wed, Oct 19, 2022 at 01:35:26PM +0200, Rafael J. Wysocki wrote:
> On Wed, Oct 19, 2022 at 11:02 AM Ville Syrjälä
> <ville.syrjala@...ux.intel.com> wrote:
> >
> > On Tue, Oct 18, 2022 at 07:34:03PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > >
> > > Commit 63f534b8bad9 ("ACPI: PCI: Rework acpi_get_pci_dev()") failed
> > > to reference count the device returned by acpi_get_pci_dev() as
> > > expected by its callers which in some cases may cause device objects
> > > to be dropped prematurely.
> > >
> > > Add the missing get_device() to acpi_get_pci_dev().
> > >
> > > Fixes: 63f534b8bad9 ("ACPI: PCI: Rework acpi_get_pci_dev()")
> >
> > FYI this (and the rtc-cmos regression discussed in
> > https://lore.kernel.org/linux-acpi/5887691.lOV4Wx5bFT@kreacher/)
> > took down the entire Intel gfx CI.
> 
> Sorry for the disturbance.
> 
> > I've applied both fixes into our fixup branch and things are looking much
> > healthier now.
> 
> Thanks for letting me know.
> 
> I've just added the $subject patch to my linux-next branch as an
> urgent fix and the other one has been applied to the RTC tree.
> 
> > This one caused i915 selftests to eat a lot of POISON_FREE
> > in the CI. While bisecting it locally I didn't have
> > poisoning enabled so I got refcount_t undeflows instead.
> 
> Unfortunately, making no mistakes is generally hard to offer.
> 
> If catching things like this early is better, what about pulling my
> bleeding-edge branch, where all of my changes are staged before going
> into linux-next, into the CI?

Pretty sure we don't have the resources to become the CI for
everyone. So testing random trees is not really possible. And 
the alternative of pulling random trees into drm-tip is probably
a not a popular idea either. We used to pull in the sound tree
since it's pretty closely tied to graphics, but I think we
stopped even that because it eneded up pulling the whole of
-rc1 in at random points in time when we were't expecting it.

Ideally each subsystem would have its own CI, or there should
be some kernel wide thing. But I suppose the progress towards
something like that is glacial.

That said, we do test linux-next to some degree. And looks like
at least one of these could have been caught a bit earlier through
that. Unfortunately no one is really keeping an eye on that so
things tend to slip through. Probably need to figure out something
to make better use of that.

> 
> > https://intel-gfx-ci.01.org/tree/drm-tip/index.html has a lot
> > of colorful boxes to click if you're interested in any of the
> > logs. The fixes are included in the CI_DRM_12259 build. Earlier
> > builds were broken.
> 
> Thanks!

-- 
Ville Syrjälä
Intel