Message-ID: <e12e347e-505e-2703-537f-c4394bebce89@linux.intel.com>
Date: Mon, 6 Jan 2020 11:18:24 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
Andi Kleen <ak@...ux.intel.com>
Cc: mingo@...hat.com, acme@...nel.org, tglx@...utronix.de,
bp@...en8.de, linux-kernel@...r.kernel.org, eranian@...gle.com,
alexey.budankov@...ux.intel.com, vitaly.slobodskoy@...el.com
Subject: Re: [RFC PATCH V2 2/7] perf: Init/fini PMU specific data
On 1/6/2020 9:31 AM, Peter Zijlstra wrote:
> On Mon, Jan 06, 2020 at 06:23:43AM -0800, Andi Kleen wrote:
>>>> + rcu_read_lock();
>>>> + for_each_process_thread(g, p) {
>>>> + mutex_lock(&p->perf_event_mutex);
>>>> + if (p->perf_ctx_data) {
>>>> + /*
>>>> + * The perf_ctx_data for this thread may have been
>>>> + * allocated by a per-task event.
>>>> + * Only update the refcount in that case.
>>>> + */
>>>> + refcount_inc(&p->perf_ctx_data->refcount);
>>>> + mutex_unlock(&p->perf_event_mutex);
>>>> + continue;
>>>> + }
>>>> +
>>>> + if (pos < num_thread) {
>>>> + refcount_set(&data[pos]->refcount, 1);
>>>> + rcu_assign_pointer(p->perf_ctx_data, data[pos++]);
>>>> + } else {
>>>> + /*
>>>> + * Some new threads may have been created
>>>> + * while we were allocating the space.
>>>> + * Track their number in nr_new_tasks.
>>>> + */
>>>> + nr_new_tasks++;
>>>> + }
>>>> + mutex_unlock(&p->perf_event_mutex);
>>>> + }
>>>> + rcu_read_unlock();
>>>> +
>>>> + raw_spin_unlock_irqrestore(&task_data_sys_wide_events_lock, flags);
>>>
>>> Still NAK. That's some mightly broken code there.
>>
>> Yes, Kan you cannot use a mutex (sleeping) inside rcu_read_lock().
>> Can perf_event_mutex be a spin lock?
>
> Or inside that raw_spin_lock.
The task_data_sys_wide_events_lock is a global lock. I think we just
need a per-task lock here.
I think I will add a new dedicated per-task raw_spin_lock to protect
perf_ctx_data.
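To make the idea concrete, a rough sketch of what the loop could look
like with such a lock (the field name perf_ctx_lock is hypothetical, and
this is only an illustration of the locking shape, not the final patch):

```c
/*
 * Hypothetical sketch: replace the sleeping mutex with a per-task
 * raw_spinlock so the critical section is legal under rcu_read_lock().
 * The field name p->perf_ctx_lock is made up for illustration.
 */
rcu_read_lock();
for_each_process_thread(g, p) {
	raw_spin_lock(&p->perf_ctx_lock);
	if (p->perf_ctx_data) {
		/* already allocated by a per-task event */
		refcount_inc(&p->perf_ctx_data->refcount);
		raw_spin_unlock(&p->perf_ctx_lock);
		continue;
	}
	if (pos < num_thread) {
		refcount_set(&data[pos]->refcount, 1);
		rcu_assign_pointer(p->perf_ctx_data, data[pos++]);
	} else {
		/* threads created while we were allocating */
		nr_new_tasks++;
	}
	raw_spin_unlock(&p->perf_ctx_lock);
}
rcu_read_unlock();
```

A raw_spinlock never sleeps, so unlike the mutex it does not violate the
RCU read-side critical section.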
> And last time I expressly said not to do
> that whole tasklist iteration under a spinlock.
>
We need an indicator to tell whether the assignment for all existing
tasks has finished, because we have to wait for it before handling the
cases below.
- Allocating the space for new threads. If we don't wait until the
assignments are finished, we cannot tell whether a perf_ctx_data was
allocated by a previous per-task event or by this system-wide event, so
the refcount may be double counted.
- There may be two or more system-wide events at the same time. While we
are allocating the space for the first event, the second one may start
profiling if we don't wait. The LBR shortening issue would still exist.
- We have to serialize assignment and free.
If we cannot use a spinlock to serialize these cases, can we set a
state when the assignment is finished, and use wait_var_event() in the
cases above?
Thanks,
Kan