linux-kernel - Re: [RFC V2 2/2] sched: idle: IRQ based next prediction for idle period


Message-ID: <56A0E31A.6020001@linaro.org>
Date:	Thu, 21 Jan 2016 14:54:34 +0100
From:	Daniel Lezcano <daniel.lezcano@...aro.org>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	peterz@...radead.org, rafael@...nel.org, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org, nicolas.pitre@...aro.org,
	vincent.guittot@...aro.org
Subject: Re: [RFC V2 2/2] sched: idle: IRQ based next prediction for idle
 period

On 01/20/2016 08:49 PM, Thomas Gleixner wrote:

[ ... ]

Thanks for all your comments. I agree with them.

One question below.

>> +static void sched_irq_timing_free(unsigned int irq)
>> +{
>> +	struct wakeup *w;
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +
>> +		w = per_cpu(wakeups[irq], cpu);
>> +		if (!w)
>> +			continue;
>> +
>> +		per_cpu(wakeups[irq], cpu) = NULL;
>> +		kfree(w);
>
> Your simple array does not work. You need a radix_tree to handle SPARSE_IRQ
> and you need proper protection against teardown.
>
> So we can avoid all that stuff and simply stick that data into irqdesc and let
> the core handle it. That allows us to use proper percpu allocations and avoid
> that for_each_possible_cpu() silliness.
>
> That leaves the iterator, but that's a solvable problem. We simply can have an
> iterator function in the irq core, which gives you the next sample
> structure. Something like this:
>
> struct irqtiming_sample *irqtiming_get_next(int *irq)
> {
> 	struct irq_desc *desc;
> 	int next;
>
> 	/* Do a racy lookup of the next allocated irq */
> 	next = irq_get_next_irq(*irq);
> 	if (next >= nr_irqs)
> 	   	 return NULL;
>
> 	*irq = next + 1;
>
> 	/* Now lookup the descriptor. It's RCU protected. */
> 	desc = irq_to_desc(next);
> 	if (!desc || !desc->irqtimings || !(desc->istate & IRQS_TIMING))
> 	   	 return NULL;
>
> 	return this_cpu_ptr(&desc->irqtimings);
> }
>
> And that needs to be called rcu protected;
>
>      	 next = 0;
>      	 rcu_read_lock();
> 	 sample = irqtiming_get_next(&next);
> 	 while (sample) {
> 	       ....
> 	       sample = irqtiming_get_next(&next);
> 	 }
>      	 rcu_read_unlock();
>
> So the interrupt part becomes:
>
>           if (desc->istate & IRQS_TIMING)
> 	       	 irqtimings_handle(__this_cpu_ptr(&desc->irqtimings));
>
> So now for the allocation/free of that data. We simply allocate/free it along
> with the irq descriptor. That IRQS_TIMING bit gets set in __setup_irq() except
> for timer interrupts. That's simple and avoids _all_ the issues.
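
To make the iterator pattern quoted above concrete, here is a minimal
userspace sketch. It is only a model under stated assumptions: the names
(desc_table, timing_get_next, NR_IRQS, the IRQS_TIMING value) are
hypothetical, a plain array stands in for the sparse irq descriptor
lookup, and there is no RCU or per-cpu handling. Unlike the quoted
version, this sketch keeps scanning past entries without timing data, so
NULL only means the table is exhausted; that is a choice made for the
sketch, not something the quoted code does.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS     8     /* hypothetical table size */
#define IRQS_TIMING 0x1u  /* hypothetical flag value */

struct sample {
	unsigned int last;
};

struct desc {
	unsigned int istate;
	struct sample *timings; /* NULL when no timing data is attached */
};

/* NULL entries model irq numbers with no allocated descriptor. */
static struct desc *desc_table[NR_IRQS];

/*
 * Return the next sample at or after *irq and advance *irq past it.
 * Returns NULL once the whole table has been scanned.
 */
static struct sample *timing_get_next(int *irq)
{
	assert(irq != NULL);

	if (*irq < 0)
		*irq = 0;

	while (*irq < NR_IRQS) {
		struct desc *d = desc_table[(*irq)++];

		/* Skip holes and descriptors without timing enabled. */
		if (d && d->timings && (d->istate & IRQS_TIMING))
			return d->timings;
	}

	return NULL;
}
```

A caller would start with next = 0 and loop until timing_get_next()
returns NULL; in the kernel that loop would additionally sit inside
rcu_read_lock()/rcu_read_unlock(), as in the quoted snippet.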

Indeed, making this part of the irq code makes everything much simpler
and self-contained. For shared interrupts, shouldn't we put the timing
samples into the irqaction structure instead of the irqdesc structure?

e.g.

#define IRQT_MAX_VALUES 4

struct irqaction {
	...
#ifdef CONFIG_IRQ_TIMINGS
	u32 irqtimings_samples[IRQT_MAX_VALUES];
#endif
	...
};

That way we don't have to deal with allocation and freeing under locks.
The drawback is that the array would go unused for timer interrupts.
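
As a rough userspace illustration of how such a fixed-size array could
be filled, the sketch below treats it as a circular buffer that
overwrites the oldest value once IRQT_MAX_VALUES samples are stored.
Everything except IRQT_MAX_VALUES is an assumption made for the example:
struct irqt_samples, irqt_record() and irqt_mean() are hypothetical
names, and the plain average is just a placeholder, not the prediction
computed by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IRQT_MAX_VALUES 4

/* Hypothetical stand-in for the timing fields embedded in irqaction. */
struct irqt_samples {
	uint32_t values[IRQT_MAX_VALUES]; /* most recent intervals */
	unsigned int head;                /* next slot to overwrite */
	unsigned int filled;              /* valid slots, <= IRQT_MAX_VALUES */
};

/* Record one interval, overwriting the oldest once the array is full. */
static void irqt_record(struct irqt_samples *s, uint32_t interval)
{
	assert(s != NULL);

	s->values[s->head] = interval;
	s->head = (s->head + 1) % IRQT_MAX_VALUES;
	if (s->filled < IRQT_MAX_VALUES)
		s->filled++;
}

/* Average of the stored intervals; 0 when nothing was recorded yet. */
static uint32_t irqt_mean(const struct irqt_samples *s)
{
	uint64_t sum = 0;
	unsigned int i;

	assert(s != NULL);

	if (s->filled == 0)
		return 0;

	for (i = 0; i < s->filled; i++)
		sum += s->values[i];

	return (uint32_t)(sum / s->filled);
}
```

Because the storage is part of the structure itself, recording a sample
needs no allocation at all, which is the point of embedding it.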

Does that make sense?

Thanks Thomas for your help, your time and your suggestions.




-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
