Message-ID: <82609267-8fc6-5b3d-c931-c0d93ab14788@gnuweeb.org>
Date:   Mon, 28 Mar 2022 11:12:53 +0700
From:   Ammar Faizi <ammarfaizi2@...weeb.org>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Alviro Iskandar Setiawan <alviro.iskandar@...il.com>,
        Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Tony Luck <tony.luck@...el.com>,
        Yazen Ghannam <yazen.ghannam@....com>,
        linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org, gwml@...r.gnuweeb.org, x86@...nel.org
Subject: Re: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when
 `threshold_create_bank()` fails

On 3/28/22 5:52 AM, Borislav Petkov wrote:
[...]
>> Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path")
> 
> How did you decide this is the commit that this is fixing?

I picked it by running git blame on the lines this patch touches. I will recheck
the tag once the question below is settled.

>> Link: https://lore.kernel.org/lkml/9dfe087a-f941-1bc4-657d-7e7c198888ff@gnuweeb.org
> 
> That Link tag is not needed.
> 
>> Co-authored-by: Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>
>> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>
>> Co-authored-by: Yazen Ghannam <yazen.ghannam@....com>
> 
> There's no "Co-authored-by".
> 
> The correct tag is described in
> 
> Documentation/process/submitting-patches.rst

I will fix these in v6.

> ...
> 
>> @@ -1350,15 +1357,14 @@ int mce_threshold_create_device(unsigned int cpu)
>>   		if (!(this_cpu_read(bank_map) & (1 << bank)))
>>   			continue;
>>   		err = threshold_create_bank(bp, cpu, bank);
>> -		if (err)
>> -			goto out_err;
>> +		if (err) {
>> +			_mce_threshold_remove_device(bp, numbanks);
>> +			return err;
>> +		}
>>   	}
>>   	this_cpu_write(threshold_banks, bp);
> 
> Do I see it correctly that the publishing of the @bp pointer - i.e.,
> this line - should be moved right above the for loop?
> 
> Then mce_threshold_remove_device() would properly free it in the error
> case and your patch turns into a oneliner?

I did exactly that in v4. After discussing it with Yazen, though, we concluded
that placing `this_cpu_write(threshold_banks, bp);` before the for loop is not
the right thing to do.

> And then your Fixes: tag would be correct too...

The reasoning comes from that discussion with Yazen; the full thread is at the
Link tag above.

==================
The point is:

On Wed, 2 Mar 2022 17:26:32 +0000, Yazen Ghannam <yazen.ghannam@....com> wrote:
> The threshold interrupt handler uses this pointer. I think the goal here is to
> set this pointer when the list is fully formed and clear this pointer before
> making any changes to the list. Otherwise, the interrupt handler will operate
> on incomplete data if an interrupt comes in the middle of these updates.
==================

Also, looking at the comment in mce_threshold_remove_device() function:

	/*
	 * Clear the pointer before cleaning up, so that the interrupt won't
	 * touch anything of this.
	 */
	this_cpu_write(threshold_banks, NULL);

I think it's reasonable to place `this_cpu_write(threshold_banks, bp);` after
the for loop on the creation path for a similar reason. In short, don't let the
interrupt handler see incomplete data.
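
To illustrate the ordering argued for above, here is a minimal user-space
sketch, not the kernel code. Every name in it (published_banks, handler_scan,
create_device, remove_device, NUM_BANKS) is hypothetical, and C11 atomics merely
stand in for the per-CPU pointer accessors. The pointer is published only after
every bank exists and is cleared before anything is freed, so the failure path
never has to unpublish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_BANKS 4	/* hypothetical bank count */

struct bank { int ready; };

/* Stand-in for the per-CPU threshold_banks pointer. */
static _Atomic(struct bank **) published_banks;

/*
 * Stand-in for the interrupt handler. Returns the number of banks it saw,
 * 0 if nothing is published, or -1 if it observed a partly built array.
 */
static int handler_scan(void)
{
	struct bank **bp = atomic_load_explicit(&published_banks,
						memory_order_acquire);
	int seen = 0;

	if (!bp)
		return 0;
	for (int i = 0; i < NUM_BANKS; i++) {
		if (!bp[i] || !bp[i]->ready)
			return -1;
		seen++;
	}
	return seen;
}

static void free_banks(struct bank **bp)
{
	for (int i = 0; i < NUM_BANKS; i++)
		free(bp[i]);	/* free(NULL) is a no-op */
	free(bp);
}

/* fail_at simulates an allocation failure at that index; -1 means none. */
static int create_device(int fail_at)
{
	struct bank **bp = calloc(NUM_BANKS, sizeof(*bp));

	if (!bp)
		return -1;
	for (int i = 0; i < NUM_BANKS; i++) {
		bp[i] = (i == fail_at) ? NULL : calloc(1, sizeof(**bp));
		if (!bp[i]) {
			/* Never published, so the handler cannot see this. */
			free_banks(bp);
			return -1;
		}
		bp[i]->ready = 1;
	}
	/* Publish only once the array is fully formed. */
	atomic_store_explicit(&published_banks, bp, memory_order_release);
	return 0;
}

static void remove_device(void)
{
	/* Clear the pointer first, then tear down. */
	struct bank **bp = atomic_exchange_explicit(&published_banks,
						    (struct bank **)NULL,
						    memory_order_acq_rel);
	if (bp)
		free_banks(bp);
}
```

In this sketch, calling create_device(2) fails midway and handler_scan() still
reports nothing published, which is the property we want from the v5 ordering.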

That said, I am not sure this fully guarantees that mce_threshold_remove_device()
cannot race with the interrupt handler (e.g. freeing the data while the handler
is still reading it), unless we use something like RCU.
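
To make the concern concrete, here is a rough user-space sketch of the window I
mean. It is not RCU and not kernel code: a plain reader counter stands in for a
grace period, and all names (cur_banks, active_readers, reader_enter, and so on)
are made up for illustration. Clearing the pointer stops new readers, but a
reader that loaded it earlier can still be using the data, so the free has to
wait for that reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct banks { int nr; };

/* Hypothetical stand-ins for the published pointer and in-flight readers. */
static _Atomic(struct banks *) cur_banks;
static atomic_int active_readers;

/* Reader side: announce ourselves first, then load the pointer. */
static struct banks *reader_enter(void)
{
	atomic_fetch_add(&active_readers, 1);
	return atomic_load(&cur_banks);
}

static void reader_exit(void)
{
	atomic_fetch_sub(&active_readers, 1);
}

static void publish(struct banks *b)
{
	atomic_store(&cur_banks, b);
}

/* Clearing the pointer only keeps *new* readers away from the data. */
static struct banks *unpublish(void)
{
	return atomic_exchange(&cur_banks, (struct banks *)NULL);
}

static bool readers_quiesced(void)
{
	return atomic_load(&active_readers) == 0;
}

/* Teardown: unpublish, wait for earlier readers to finish, then free. */
static void remove_and_free(void)
{
	struct banks *old = unpublish();

	while (!readers_quiesced())
		;	/* crude busy-wait standing in for a grace period */
	free(old);
}
```

Without the wait in remove_and_free(), a reader that entered before unpublish()
would be left holding a pointer to freed memory. Whether the real interrupt
handler can actually overlap with the teardown like this is exactly the part I
am unsure about.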

What do you think?

-- 
Ammar Faizi
