Message-ID: <78364595-78f5-f9da-1b45-a94f49f81996@linux.intel.com>
Date: Wed, 8 Jan 2020 14:52:10 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Andi Kleen <ak@...ux.intel.com>
Cc: peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
tglx@...utronix.de, bp@...en8.de, linux-kernel@...r.kernel.org,
eranian@...gle.com, alexey.budankov@...ux.intel.com,
vitaly.slobodskoy@...el.com
Subject: Re: [RFC PATCH V3 2/7] perf: attach/detach PMU specific data
On 1/8/2020 11:50 AM, Andi Kleen wrote:
>> +static int
>> +attach_system_wide_ctx_data(size_t ctx_size)
>> +{
>> +	int i, num_thread, pos, nr_failed_alloc;
>> +	unsigned long flags = GFP_ATOMIC;
>> +	struct perf_ctx_data *tsk_data;
>> +	struct perf_ctx_data **data;
>> +	struct task_struct *g, *p;
>> +	bool re_alloc = true;
>> +
>> +	/* Retrieve total number of threads */
>> +	num_thread = nr_threads;
>> +
>> +	data = kcalloc(num_thread, sizeof(*data), GFP_KERNEL);
>
> This probably needs kvcalloc for reliability and avoiding stalls.
>
Yes, kvcalloc looks better.
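Something like the below (an untested sketch; only the allocation and the
matching free path change, everything else in the function stays the same):

	/* kvcalloc() tries kmalloc first and falls back to vmalloc */
	data = kvcalloc(num_thread, sizeof(*data), GFP_KERNEL);
	if (!data)
		return -ENOMEM;

	/* the corresponding free path must then use kvfree() */
	kvfree(data);

kvfree() handles both kmalloc'ed and vmalloc'ed memory, so the free path
stays simple.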
>> +	if (!data) {
>> +		printk_once(KERN_DEBUG
>> +			    "Failed to allocate space for LBR callstack. "
>> +			    "The LBR callstack for all tasks may be cut off.\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	atomic_inc(&nr_task_data_sys_wide_events);
>> +
>> +repeat:
>> +	/*
>> +	 * Allocate perf_ctx_data for all existing threads.
>> +	 * The perf_ctx_data for new threads will be allocated in
>> +	 * perf_event_fork().
>> +	 * Do a quick allocation in the first round with GFP_ATOMIC.
>> +	 */
>> +	for (i = 0; i < num_thread; i++) {
>> +		if (alloc_perf_ctx_data(ctx_size, flags, &data[i]))
>> +			break;
>> +	}
>> +	num_thread = i;
>> +	nr_failed_alloc = 0;
>> +	pos = 0;
>> +
>
>> +	rcu_read_lock();
>> +	for_each_process_thread(g, p) {
>> +		raw_spin_lock(&p->perf_ctx_data_lock);
>> +		tsk_data = p->perf_ctx_data;
>> +		if (tsk_data) {
>
> That will be a lot of lock acquisitions, even for tasks that don't use
> perf, but I guess we really need it, and it's bounded by the number of tasks.
Right. We don't know which tasks will be monitored later, so we have to
attach perf_ctx_data to all of them. The per-task lock is required to
serialize writers of the perf_ctx_data RCU pointer.
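To make the pairing explicit, the write and read sides look roughly like
this (a sketch for illustration only, not part of the patch; new_data
stands for whatever buffer is being installed):

	/* writer: the per-task raw spinlock serializes concurrent updaters */
	raw_spin_lock(&p->perf_ctx_data_lock);
	rcu_assign_pointer(p->perf_ctx_data, new_data);
	raw_spin_unlock(&p->perf_ctx_data_lock);

	/* reader: only an RCU read-side critical section is needed */
	rcu_read_lock();
	tsk_data = rcu_dereference(p->perf_ctx_data);
	if (tsk_data)
		use(tsk_data);	/* placeholder for the actual consumer */
	rcu_read_unlock();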
Thanks,
Kan
>
>> +		}
>> +
>> +		if (pos < num_thread) {
>> +			refcount_set(&data[pos]->refcount, TASK_DATA_SYS_WIDE);
>> +			rcu_assign_pointer(p->perf_ctx_data, data[pos++]);
>> +		} else {
>> +			/*
>> +			 * The quick allocation in the first round may fail.
>> +			 * Track the number of failures in nr_failed_alloc.
>> +			 */
>> +			nr_failed_alloc++;
>> +		}
>> +		raw_spin_unlock(&p->perf_ctx_data_lock);
>> +	}
>> +	rcu_read_unlock();
>
>
> -Andi
>