Message-ID: <e12e347e-505e-2703-537f-c4394bebce89@linux.intel.com>
Date: Mon, 6 Jan 2020 11:18:24 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
Andi Kleen <ak@...ux.intel.com>
Cc: mingo@...hat.com, acme@...nel.org, tglx@...utronix.de,
bp@...en8.de, linux-kernel@...r.kernel.org, eranian@...gle.com,
alexey.budankov@...ux.intel.com, vitaly.slobodskoy@...el.com
Subject: Re: [RFC PATCH V2 2/7] perf: Init/fini PMU specific data
On 1/6/2020 9:31 AM, Peter Zijlstra wrote:
> On Mon, Jan 06, 2020 at 06:23:43AM -0800, Andi Kleen wrote:
>>>> + rcu_read_lock();
>>>> + for_each_process_thread(g, p) {
>>>> + mutex_lock(&p->perf_event_mutex);
>>>> + if (p->perf_ctx_data) {
>>>> + /*
>>>> + * The perf_ctx_data for this thread may have been
>>>> + * allocated by a per-task event.
>>>> + * Only update the refcount in that case.
>>>> + */
>>>> + refcount_inc(&p->perf_ctx_data->refcount);
>>>> + mutex_unlock(&p->perf_event_mutex);
>>>> + continue;
>>>> + }
>>>> +
>>>> + if (pos < num_thread) {
>>>> + refcount_set(&data[pos]->refcount, 1);
>>>> + rcu_assign_pointer(p->perf_ctx_data, data[pos++]);
>>>> + } else {
>>>> + /*
>>>> + * Some new threads may have been created
>>>> + * while we were allocating the space.
>>>> + * Track their number in nr_new_tasks.
>>>> + */
>>>> + nr_new_tasks++;
>>>> + }
>>>> + mutex_unlock(&p->perf_event_mutex);
>>>> + }
>>>> + rcu_read_unlock();
>>>> +
>>>> + raw_spin_unlock_irqrestore(&task_data_sys_wide_events_lock, flags);
>>>
>>> Still NAK. That's some mightly broken code there.
>>
>> Yes, Kan you cannot use a mutex (sleeping) inside rcu_read_lock().
>> Can perf_event_mutex be a spin lock?
>
> Or inside that raw_spin_lock.
The task_data_sys_wide_events_lock is a global lock. I think we just
need a per-task lock here.
I think I will add a new dedicated per-task raw_spin_lock to protect
perf_ctx_data.
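To make the idea concrete, a rough sketch of what the loop could look
like with such a lock (the field name perf_ctx_lock is hypothetical, and
this is only an illustration of the locking shape, not the final patch):

```c
/*
 * Hypothetical sketch: replace the sleeping mutex with a per-task
 * raw_spinlock so the critical section is legal under rcu_read_lock().
 * The field name p->perf_ctx_lock is made up for illustration.
 */
rcu_read_lock();
for_each_process_thread(g, p) {
	raw_spin_lock(&p->perf_ctx_lock);
	if (p->perf_ctx_data) {
		/* already allocated by a per-task event */
		refcount_inc(&p->perf_ctx_data->refcount);
		raw_spin_unlock(&p->perf_ctx_lock);
		continue;
	}
	if (pos < num_thread) {
		refcount_set(&data[pos]->refcount, 1);
		rcu_assign_pointer(p->perf_ctx_data, data[pos++]);
	} else {
		/* threads created while we were allocating */
		nr_new_tasks++;
	}
	raw_spin_unlock(&p->perf_ctx_lock);
}
rcu_read_unlock();
```

A raw_spinlock never sleeps, so unlike the mutex it does not violate the
RCU read-side critical section.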
> And last time I expressly said not to do
> that whole tasklist iteration under a spinlock.
>
We need an indicator to tell whether the assignment for all existing
tasks has finished, because we have to wait for it before handling the
cases below.
- Allocating the space for new threads. If we don't wait until the
assignments are finished, we cannot tell whether a perf_ctx_data was
allocated by a previous per-task event or by this system-wide event, so
the refcount may be double counted.
- There may be two or more system-wide events at the same time. While we
are allocating the space for the first event, the second one may start
profiling if we don't wait. The LBR shortening issue would still exist.
- We have to serialize assignment and free.
If we cannot use a spinlock to serialize these cases, can we set a
state when the assignment is finished, and use wait_var_event() in the
cases above?
Thanks,
Kan