[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <82609267-8fc6-5b3d-c931-c0d93ab14788@gnuweeb.org>
Date: Mon, 28 Mar 2022 11:12:53 +0700
From: Ammar Faizi <ammarfaizi2@...weeb.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Alviro Iskandar Setiawan <alviro.iskandar@...il.com>,
Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Tony Luck <tony.luck@...el.com>,
Yazen Ghannam <yazen.ghannam@....com>,
linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, gwml@...r.gnuweeb.org, x86@...nel.org
Subject: Re: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when
`threshold_create_bank()` fails
On 3/28/22 5:52 AM, Borislav Petkov wrote:
[...]
>> Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path")
>
> How did you decide this is the commit that this is fixing?
I examined the history in those lines by git blame. Will recheck after the below
doubt is cleared.
>> Link: https://lore.kernel.org/lkml/9dfe087a-f941-1bc4-657d-7e7c198888ff@gnuweeb.org
>
> That Link tag is not needed.
>
>> Co-authored-by: Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>
>> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>
>> Co-authored-by: Yazen Ghannam <yazen.ghannam@....com>
>
> There's no "Co-authored-by".
>
> The correct tag is described in
>
> Documentation/process/submitting-patches.rst
Will fix them in the v6.
> ...
>
>> @@ -1350,15 +1357,14 @@ int mce_threshold_create_device(unsigned int cpu)
>> if (!(this_cpu_read(bank_map) & (1 << bank)))
>> continue;
>> err = threshold_create_bank(bp, cpu, bank);
>> - if (err)
>> - goto out_err;
>> + if (err) {
>> + _mce_threshold_remove_device(bp, numbanks);
>> + return err;
>> + }
>> }
>> this_cpu_write(threshold_banks, bp);
>
> Do I see it correctly that the publishing of the @bp pointer - i.e.,
> this line - should be moved right above the for loop?
>
> Then mce_threshold_remove_device() would properly free it in the error
> case and your patch turns into a oneliner?
Previously, in v4 I did that too. But after discussion with Yazen, we got a
conclusion that placing `this_cpu_write(threshold_banks, bp);` before the for loop
is not the right thing to do.
> And then your Fixes: tag would be correct too...
The reason is based on the discussion with Yazen, the full discussion can be read in
the Link tag above.
==================
The point is:
On Wed, 2 Mar 2022 17:26:32 +0000, Yazen Ghannam <yazen.ghannam@....com> wrote:
> The threshold interrupt handler uses this pointer. I think the goal here is to
> set this pointer when the list is fully formed and clear this pointer before
> making any changes to the list. Otherwise, the interrupt handler will operate
> on incomplete data if an interrupt comes in the middle of these updates.
==================
Also, looking at the comment in mce_threshold_remove_device() function:
/*
* Clear the pointer before cleaning up, so that the interrupt won't
* touch anything of this.
*/
this_cpu_write(threshold_banks, NULL);
I think it's reasonable to place `this_cpu_write(threshold_banks, bp);` after
the "for loop" on the creation process for the similar reason. In short, don't
let the interrupt sees incomplete data.
Although, I am not sure if that 100% guarantees mce_threshold_remove_device()
will not mess up with the interrupt (e.g. freeing the data while the interrupt
reading it), unless we're using RCU stuff.
What do you think?
--
Ammar Faizi
Powered by blists - more mailing lists