[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0gdcVSOPkfs0yzfubpU9EXu02n2u8Pau7sE=yrZ-mvDEQ@mail.gmail.com>
Date: Wed, 19 Oct 2022 15:00:01 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Ville Syrjälä <ville.syrjala@...ux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Linux PCI <linux-pci@...r.kernel.org>,
Linux ACPI <linux-acpi@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Bjorn Helgaas <helgaas@...nel.org>,
dri-devel@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org
Subject: Re: [PATCH] ACPI: PCI: Fix device reference counting in acpi_get_pci_dev()
On Wed, Oct 19, 2022 at 2:22 PM Ville Syrjälä
<ville.syrjala@...ux.intel.com> wrote:
>
> On Wed, Oct 19, 2022 at 01:35:26PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Oct 19, 2022 at 11:02 AM Ville Syrjälä
> > <ville.syrjala@...ux.intel.com> wrote:
> > >
> > > On Tue, Oct 18, 2022 at 07:34:03PM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > > >
> > > > Commit 63f534b8bad9 ("ACPI: PCI: Rework acpi_get_pci_dev()") failed
> > > > to reference count the device returned by acpi_get_pci_dev() as
> > > > expected by its callers which in some cases may cause device objects
> > > > to be dropped prematurely.
> > > >
> > > > Add the missing get_device() to acpi_get_pci_dev().
> > > >
> > > > Fixes: 63f534b8bad9 ("ACPI: PCI: Rework acpi_get_pci_dev()")
> > >
> > > FYI this (and the rtc-cmos regression discussed in
> > > https://lore.kernel.org/linux-acpi/5887691.lOV4Wx5bFT@kreacher/)
> > > took down the entire Intel gfx CI.
> >
> > Sorry for the disturbance.
> >
> > > I've applied both fixes into our fixup branch and things are looking much
> > > healthier now.
> >
> > Thanks for letting me know.
> >
> > I've just added the $subject patch to my linux-next branch as an
> > urgent fix and the other one has been applied to the RTC tree.
> >
> > > This one caused i915 selftests to eat a lot of POISON_FREE
> > > in the CI. While bisecting it locally I didn't have
> > > poisoning enabled so I got refcount_t undeflows instead.
> >
> > Unfortunately, making no mistakes is generally hard to offer.
> >
> > If catching things like this early is better, what about pulling my
> > bleeding-edge branch, where all of my changes are staged before going
> > into linux-next, into the CI?
>
> Pretty sure we don't have the resources to become the CI for
> everyone. So testing random trees is not really possible. And
> the alternative of pulling random trees into drm-tip is probably
> a not a popular idea either. We used to pull in the sound tree
> since it's pretty closely tied to graphics, but I think we
> stopped even that because it eneded up pulling the whole of
> -rc1 in at random points in time when we were't expecting it.
I see.
> Ideally each subsystem would have its own CI, or there should
> be some kernel wide thing. But I suppose the progress towards
> something like that is glacial.
Well, I definitely cannot afford a dedicated CI just for my tree and I
haven't got any useful information from KernlCI yet (even though hey
pull and test my linux-next branch on a regular basis).
KernelCI seems to be focusing on different set of hardware, so to speak.
> That said, we do test linux-next to some degree. And looks like
> at least one of these could have been caught a bit earlier through
> that. Unfortunately no one is really keeping an eye on that so
> things tend to slip through. Probably need to figure out something
> to make better use of that.
I think it could also be possible to contribute to KernelCI to get
more useful x86 coverage from it.
Powered by blists - more mailing lists