lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a7dd72c5-ed0a-4270-b3a7-5775037703e4@linux.alibaba.com>
Date: Mon, 19 Feb 2024 17:12:06 +0800
From: Bitao Hu <yaoma@...ux.alibaba.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, Doug Anderson <dianders@...omium.org>,
 akpm@...ux-foundation.org, Petr Mladek <pmladek@...e.com>,
 kernelfans@...il.com, Liu Song <liusong@...ux.alibaba.com>,
 yaoma@...ux.alibaba.com
Subject: Re: [PATCHv7 2/2] watchdog/softlockup: report the most frequent
 interrupts

Hi,

On 2024/2/15 19:30, Thomas Gleixner wrote:
> On Wed, Feb 14 2024 at 10:14, Bitao Hu wrote:
>> +static void start_counting_irqs(void)
>> +{
>> +	int i;
>> +	int local_nr_irqs;
>> +	struct irq_desc *desc;
>> +	u32 *counts = __this_cpu_read(hardirq_counts);
>> +
>> +	if (!counts) {
>> +		/*
>> +		 * nr_irqs has the potential to grow at runtime. We should read
>> +		 * it and store locally to avoid array out-of-bounds access.
>> +		 */
>> +		local_nr_irqs = nr_irqs;
>> +		counts = kcalloc(local_nr_irqs, sizeof(u32), GFP_ATOMIC);
> 
> Seriously? The system has a problem and you allocate memory from the
> detection code in hard interrupt context?
I want all the changes for this feature to be concentrated within the
watchdog module, and I am also unsure whether modifying the irq code
for this feature would be justified. Hence, I opted for this approach.
However, your reply on V1 demonstrated the proper way to do it, so I
will refactor accordingly.

>> +		for (i = 0; i < NUM_HARDIRQ_REPORT; i++) {
>> +			if (irq_counts_sorted[i].irq == -1)
>> +				break;
>> +
>> +			desc = irq_to_desc(irq_counts_sorted[i].irq);
>> +			if (desc && desc->action)
>> +				printk(KERN_CRIT "\t#%u: %-10u\tirq#%d(%s)\n",
>> +				       i + 1, irq_counts_sorted[i].counts,
>> +				       irq_counts_sorted[i].irq, desc->action->name);
> 
> You cannot dereference desc->action here:
> 
>      1) It can be NULL'ed between check and dereference.
> 
>      2) Both 'action' and 'action->name' can be freed in parallel
> 
> And no, you cannot take desc->lock here to prevent this. Stop fiddling
> in the internals of interrupt descriptors.
Thanks for your analysis. However, I have a question. 'action->name'
cannot be accessed here, and it seems that merely outputting the
irq number provides insufficient information?

> 
> 
> and the analysis boils down to:
> 
>          u64 cnt, sorted[3] = {};
>          unsigned int irq, i;
> 
>      	for_each_active_irq(irq) {
>          	cnt = kstat_get_irq_since_snapshot(irq);
> 
> 		if (cnt) {
>                       	for (cnt = (cnt << 32) + irq, i = 0; i < 3; i++) {
>                  		if (cnt > sorted[i])
>                          		swap(cnt, sorted[i]);
Hmm, I think the approach here isn't optimal. If some interrupts
have the same count, then it effectively results in sorting by the
irq number. Is my understanding correct?

Best Regards,
	Bitao


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ