[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <jrjfdolkg5rfw2xtngqt2bc6ugssxralaqw4q6gyijlkuhyecm@kvyb3fkwngrr>
Date: Mon, 14 Oct 2024 15:12:36 -0500
From: Lucas De Marchi <lucas.demarchi@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <linux-kernel@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>, "Ingo
Molnar" <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
"Umesh Nerlige Ramappa" <umesh.nerlige.ramappa@...el.com>, Ian Rogers
<irogers@...gle.com>, Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
Subject: Re: [PATCH 3/5] perf: Add pmu get/put
On Mon, Oct 14, 2024 at 09:25:19PM +0200, Peter Zijlstra wrote:
>On Mon, Oct 14, 2024 at 01:20:34PM -0500, Lucas De Marchi wrote:
>> On Mon, Oct 14, 2024 at 07:32:46PM +0200, Peter Zijlstra wrote:
>
>> > I'm confused.. probably because I still don't have any clue about
>> > drivers and the above isn't really telling me much either.
>> >
>> > I don't see how you get rid of the try_module_get() we do per event;
>> > without that you can't unload the module.
>>
>> I don't get rid of the try_module_get(). They serve diffeerent purposes.
>> Having a reference to the module prevents the _module_ going away (and
>> hence the function pointers we call into from perf). It doesn't prevent
>> the module unbinding from the HW. A module may have N instances if it's
>> bound to N devices.
>>
>> This can be done today to unbind the HW (integrated graphics) from the
>> i915 module:
>>
>> # echo 0000:00:02.0 > /sys/bus/pci/drivers/i915/unbind
>>
>> The ref taken by these new get()/put() are related to preventing the
>> data going away - the driver can use that to take a ref on something
>> that will survive the unbind.
>
>OK, for some reason I thought to remember that you wanted to be able to
>unload the module too.
humn... maybe because in the first version we were talking about
shortcutting all the function calls. If made by the driver, it'd allow
to remove the ugly `if (pmu->closed)`, and if made by perf itself, it
would allow to drop the module ref since we would guarantee we are not
calling into the module anymore.
But I think that's orthogonal to what we really care about: once the HW
vanishes underneath us, stop accessing it and unregister the PMU in a
way that if the HW shows up again we can still register it and nothing
explodes.
>
>> > And I don't see how you think it is safe to free a pmu while there are
>> > still events around.
>>
>> so, we don't actually free it - the pmu is unregistered but the
>> `struct pmu` and (possibly) its container are still around after unregister.
>> When the get/put are used, the driver can keep the data around, which is
>> then free'd when the last reference is put.
>
>Aaaaah, okay. So the implementation knows to nop out all device
>interaction when it gets unbound, but the events and pmu data stick
>around until they're naturally killed off?
>
>Ah, reading the below patches that is indeed what i915 does. pmu->closed
>makes this so.
yes. Without these patches it doesn't work well though, particuarly
because
a) even if we protected the methods with pmu->closed(), the data is
freed when we call perf_pmu_unregister(), making the whole pmu pointer
invalid
b) kernel/core/events.c still accesss the pmu after calling
event->destroy() - we can't really hook on that to destroy the pmu like
is done today.
>
>The dummy thing you posted in this thread, does perf_event_disable() on
that's optional and I think we could live without. The main issue is
completely crashing the kernel if we do:
# perf stat -e i915/rc6-residency/ -I1000 &
# echo 0000:00:02.0 > /sys/bus/pci/drivers/i915/unbind
... which these patches are fixing regardless of calling
perf_event_disable().
>all previously created events, and this is not sound. Userspace can do
>PERF_EVENT_IOC_ENABLE on them and then things will go side-ways fast.
>And I was afraid i915 was doing this same.
For the i915 series, that would be "[PATCH 8/8] drm/i915/pmu: Release
open events when unregistering". See release_active_events()
I was just wanting a smoke signal/hint for userspace that "something
went wrong" with this counter.
>
>> - Subject: [PATCH 3/8] drm/i915/pmu: Fix crash due to use-after-free
>
>So reading that Changelog, you would like a replacement for pmu->closed
>as well.
>
>I suppose, one way to go about doing this is to have
>perf_pmu_unregister() replace a bunch of methods. Notably you have
>pmu->closed in:
>
> - event_init()
> - read()
> - start()
> - stop()
> - add()
>
>Having perf_pmu_unregister() overwrite those function pointers with
>something akin to your pmu->closed would go a long way, right? It would
it would help if we want to unload the module, to make it work for
other drivers without having to add similar flag and to make those
drivers less ugly with those checks. However grepping the
kernel for calls to perf_pmu_register(), it seems there are only 3
candidates, all in drm: i915, amdgpu and xe (the xe is a pending patch
series waiting on the resolution of this issue). There is
drivers/powercap/intel_rapl_common.c with its `bool registered` flag,
but that is basically doing multiple register/unregister to replace the pmu
rather than working with HW that can be removed.
>require using READ_ONCE() for calling the methods, which would make
>things a little ugly :/
>
>But I also think we want to force all the events into STATE_ERROR, and
yeah... When I was looking for the smoke signal I mentioned, I wanted
something to move the event into that state, but couldn't find one. The
best I found was to disable it.
>I'm not immediately sure how best to go about doing that. Adding better
>return value handling to ->add() is trivial enough, and perhaps calling
>perf_pmu_resched() is sufficient to cycle everything.
>
>Let me ponder that a little bit.
thanks for the help
Lucas De Marchi
>
Powered by blists - more mailing lists