Message-ID: <cdabc034-1365-d254-8ce5-b5a70d45a28e@arm.com>
Date:   Mon, 6 Mar 2023 11:34:16 +0000
From:   James Morse <james.morse@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     Fenghua Yu <fenghua.yu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        H Peter Anvin <hpa@...or.com>,
        Babu Moger <Babu.Moger@....com>,
        shameerali.kolothum.thodi@...wei.com,
        D Scott Phillips OS <scott@...amperecomputing.com>,
        carl@...amperecomputing.com, lcherian@...vell.com,
        bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
        xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
        Jamie Iles <quic_jiles@...cinc.com>,
        Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com
Subject: Re: [PATCH v2 16/18] x86/resctrl: Allow overflow/limbo handlers to be
 scheduled on any-but cpu

Hi Reinette,

On 02/02/2023 23:49, Reinette Chatre wrote:
> On 1/13/2023 9:54 AM, James Morse wrote:
>> When a cpu is taken offline resctrl may need to move the overflow or
>> limbo handlers to run on a different CPU.
>> Once the offline callbacks have been split, cqm_setup_limbo_handler()
>> will be called while the CPU that is going offline is still present
>> in the cpu_mask.
>>
>> Pass the CPU to exclude to cqm_setup_limbo_handler() and
>> mbm_setup_overflow_handler(). These functions can use cpumask_any_but()
>> when selecting the CPU. -1 is used to indicate no CPUs need excluding.

>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 1a214bd32ed4..334fb3f1c6e2 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c

>> @@ -773,15 +773,27 @@ void cqm_handle_limbo(struct work_struct *work)
>>  	mutex_unlock(&rdtgroup_mutex);
>>  }
>>  
>> -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
>> +/**
>> + * cqm_setup_limbo_handler() - Schedule the limbo handler to run for this
>> + *                             domain.
>> + * @delay_ms:      How far in the future the handler should run.
>> + * @exclude_cpu:   Which CPU the handler should not run on, -1 to pick any CPU.
>> + */
>> +void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
>> +			     int exclude_cpu)
>>  {
>>  	unsigned long delay = msecs_to_jiffies(delay_ms);
>>  	int cpu;
>>  
>> -	cpu = cpumask_any(&dom->cpu_mask);
>> +	if (exclude_cpu == -1)
>> +		cpu = cpumask_any(&dom->cpu_mask);
>> +	else
>> +		cpu = cpumask_any_but(&dom->cpu_mask, exclude_cpu);
>> +
>>  	dom->cqm_work_cpu = cpu;
>>  
> 
> This assignment is unexpected considering the error handling that follows.
> cqm_work_cpu can thus be >= nr_cpu_ids. I assume it is to help during
> domain remove where the CPU being removed is checked against this value?
> If indeed this invalid CPU assignment is done in support of future code
> path, could you please add a comment to help explain this assignment?

Looks like I ignored it because in the last-man-standing case the domain is going to get
free()d anyway, but I couldn't find a 'cpu >= nr_cpu_ids' check under
schedule_delayed_work_on(), hence the error handling.

I'll move the dom->mbm_work_cpu under the nr_cpu_ids check too so that it doesn't look funny.
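
For illustration only, here is roughly the shape I have in mind, written as a
standalone userspace sketch rather than the real patch. Everything prefixed
toy_ is made up for this example, and NR_CPU_IDS and the 64-bit mask are
stand-ins for nr_cpu_ids and struct cpumask. The point is that the work CPU is
only recorded once the selected CPU is known to be valid:

```c
#include <stdint.h>

#define NR_CPU_IDS 64	/* assumption: toy stand-in for nr_cpu_ids */

/* Hypothetical, simplified stand-in for struct rdt_domain. */
struct toy_domain {
	uint64_t cpu_mask;	/* bit n set => CPU n belongs to the domain */
	int cqm_work_cpu;	/* CPU the limbo handler was last scheduled on */
};

/*
 * Toy stand-in for cpumask_any_but(): return the first CPU set in @mask
 * that is not @exclude_cpu, or NR_CPU_IDS if there is none.
 * Passing -1 as @exclude_cpu excludes nothing.
 */
static int toy_any_but(uint64_t mask, int exclude_cpu)
{
	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++) {
		if (cpu == exclude_cpu)
			continue;
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NR_CPU_IDS;
}

/*
 * Returns 1 if a handler would have been scheduled, 0 if no eligible
 * CPU was left. cqm_work_cpu is only written in the former case.
 */
static int toy_setup_limbo_handler(struct toy_domain *dom, int exclude_cpu)
{
	int cpu = toy_any_but(dom->cpu_mask, exclude_cpu);

	if (cpu >= NR_CPU_IDS)
		return 0;	/* last CPU going away: leave cqm_work_cpu alone */

	dom->cqm_work_cpu = cpu;
	/* The real code would call schedule_delayed_work_on(cpu, ...) here. */
	return 1;
}
```

With CPUs 1 and 2 in the mask and CPU 1 excluded, this picks CPU 2. With only
CPU 2 in the mask and CPU 2 excluded, nothing is scheduled and cqm_work_cpu
keeps its old value instead of holding an out-of-range CPU number.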


Thanks,

James
