Message-ID: <20150722015045.GA24420@us.ibm.com>
Date: Tue, 21 Jul 2015 18:50:45 -0700
From: Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-s390@...r.kernel.org, sparclinux@...r.kernel.org
Subject: Re: [PATCH v3 7/8] perf: Define PMU_TXN_READ interface
Peter Zijlstra [peterz@...radead.org] wrote:
| On Tue, Jul 14, 2015 at 08:01:54PM -0700, Sukadev Bhattiprolu wrote:
| > +/*
| > + * Use the transaction interface to read the group of events in @leader.
| > + * PMUs like the 24x7 counters in Power, can use this to queue the events
| > + * in the ->read() operation and perform the actual read in ->commit_txn.
| > + *
| > + * Other PMUs can ignore the ->start_txn and ->commit_txn and read each
| > + * PMU directly in the ->read() operation.
| > + */
| > +static int perf_event_read_group(struct perf_event *leader)
| > +{
| > + int ret;
| > + struct perf_event *sub;
| > + struct pmu *pmu;
| > +
| > + pmu = leader->pmu;
| > +
| > + pmu->start_txn(pmu, PERF_PMU_TXN_READ);
| > +
| > + perf_event_read(leader);
|
| There should be a lockdep assert with that list iteration.
|
| > + list_for_each_entry(sub, &leader->sibling_list, group_entry)
| > + perf_event_read(sub);
| > +
| > + ret = pmu->commit_txn(pmu);
Peter,

I have a situation :-)

We are trying to use the following interface:

	pmu->start_txn(pmu, PERF_PMU_TXN_READ);

	perf_event_read(leader);
	list_for_each_entry(sibling, &leader->sibling_list, group_entry)
		perf_event_read(sibling);

	pmu->commit_txn(pmu);

with the idea that the PMU driver would save the type of transaction in
->start_txn() and use it in ->read() and ->commit_txn().
But since ->start_txn() and the ->read() operations could happen on different
CPUs (perf_event_read() uses the event->oncpu to schedule a call), the PMU
driver cannot use a per-cpu variable to save the state in ->start_txn().
I tried using a pmu-wide global, but that would also need us to hold a mutex
to serialize access to that global. The problem is that ->start_txn() can be
called from interrupt context for the TXN_ADD transactions, where taking a
mutex is not allowed (I got the following backtrace during testing):
mutex_lock_nested+0x504/0x520 (unreliable)
h_24x7_event_start_txn+0x3c/0xd0
group_sched_in+0x70/0x230
ctx_sched_in.isra.63+0x150/0x230
__perf_install_in_context+0x1c8/0x1e0
remote_function+0x7c/0xa0
flush_smp_call_function_queue+0xb0/0x1d0
smp_ipi_demux+0x88/0xf0
icp_hv_ipi_action+0x54/0xc0
handle_irq_event_percpu+0x98/0x2b0
handle_percpu_irq+0x7c/0xc0
generic_handle_irq+0x4c/0x80
__do_irq+0x7c/0x190
call_do_irq+0x14/0x24
do_IRQ+0x8c/0x100
hardware_interrupt_common+0x168/0x180
--- interrupt: 501 at .plpar_hcall_norets+0x14/0x20
Basically, I am stuck trying to save the txn type in ->start_txn() and
retrieve it in ->read().
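
To make the problem concrete, here is a minimal userspace model (not kernel
code) of what goes wrong with per-cpu state. The names model_start_txn(),
model_read_sees() and model_group_read(), the TXN_* values and NR_CPUS are
all made up for illustration; the array simply stands in for a per-cpu
variable.

```c
#include <assert.h>

#define NR_CPUS   4   /* hypothetical CPU count for this model */
#define TXN_NONE  0
#define TXN_READ  2   /* stand-in for PERF_PMU_TXN_READ */

/* Models a per-cpu variable: one slot per CPU. */
static unsigned int txn_flags[NR_CPUS];

/* ->start_txn() records the txn type in the slot of the CPU it runs on. */
static void model_start_txn(int cpu, unsigned int type)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	txn_flags[cpu] = type;
}

/* ->read() can only see the slot of the CPU it runs on. */
static unsigned int model_read_sees(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return txn_flags[cpu];
}

/*
 * Models a group read: ->start_txn() runs on caller_cpu, while ->read()
 * is scheduled on event_cpu (the event->oncpu of the real code).
 * Returns 1 if ->read() observed the READ transaction, 0 otherwise.
 */
static int model_group_read(int caller_cpu, int event_cpu)
{
	int seen;

	model_start_txn(caller_cpu, TXN_READ);
	seen = (model_read_sees(event_cpu) == TXN_READ);
	txn_flags[caller_cpu] = TXN_NONE;	/* ->commit_txn() clears the state */

	return seen;
}
```

In this model, model_group_read(0, 0) observes the READ transaction, while
model_group_read(0, 1) does not, which is the case perf_event_read() can hit
when the event is active on another CPU.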
A couple of options I can think of:

	- have ->start_txn() return a handle that is then passed to
	  ->read() (yuck) and ->commit_txn().

	- serialize the READ transaction for the PMU in perf_event_read_group()
	  with a new pmu->txn_mutex:

		mutex_lock(&pmu->txn_mutex);

		pmu->start_txn(pmu, PERF_PMU_TXN_READ);

		perf_event_read(leader);
		list_for_each_entry(sub, &leader->sibling_list, group_entry)
			perf_event_read(sub);

		ret = pmu->commit_txn(pmu);

		mutex_unlock(&pmu->txn_mutex);

	  Such serialization would be OK for the 24x7 counters (they are
	  system-wide counters anyway). We could maybe skip the mutex for
	  PMUs that don't implement the TXN_READ interface.

Or is there a better way?
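
For reference, the first option (a handle passed around by the caller) could
look roughly like the sketch below. It is a userspace model only, under the
assumption that the caller owns the handle and passes it to every callback;
struct txn_handle, the sketch_*() functions, MAX_QUEUED and the TXN_* values
are hypothetical and do not exist in the perf core.

```c
#include <assert.h>
#include <stddef.h>

#define TXN_NONE   0
#define TXN_ADD    1   /* stand-in for PERF_PMU_TXN_ADD */
#define TXN_READ   2   /* stand-in for PERF_PMU_TXN_READ */
#define MAX_QUEUED 8   /* hypothetical limit on queued events */

/* Hypothetical per-transaction state, owned by the caller. */
struct txn_handle {
	unsigned int type;
	int nr_queued;
	int queued[MAX_QUEUED];	/* ids of events queued by ->read() */
};

/* ->start_txn(): record the txn type in the handle, not in the PMU. */
static void sketch_start_txn(struct txn_handle *h, unsigned int type)
{
	assert(h != NULL);
	h->type = type;
	h->nr_queued = 0;
}

/*
 * ->read(): inside a READ transaction, only queue the event.
 * Returns 0 on success, -1 if the queue is full.
 */
static int sketch_read(struct txn_handle *h, int event_id)
{
	assert(h != NULL);
	if (h->type != TXN_READ)
		return 0;	/* a real driver would read the counter here */
	if (h->nr_queued == MAX_QUEUED)
		return -1;
	h->queued[h->nr_queued++] = event_id;
	return 0;
}

/*
 * ->commit_txn(): returns how many queued events would be fetched in
 * one batch, then resets the handle.
 */
static int sketch_commit_txn(struct txn_handle *h)
{
	int n;

	assert(h != NULL);
	n = (h->type == TXN_READ) ? h->nr_queued : 0;
	h->type = TXN_NONE;
	h->nr_queued = 0;
	return n;
}
```

Since all the state lives in the handle, it does not matter which CPU each
callback runs on and no PMU-wide lock is needed, at the cost of changing the
->read() and ->commit_txn() signatures.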
Sukadev