Date:   Thu, 27 Aug 2020 14:02:27 +0000
From:   Shiju Jose <shiju.jose@...wei.com>
To:     Borislav Petkov <bp@...en8.de>
CC:     "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mchehab@...nel.org" <mchehab@...nel.org>,
        "tony.luck@...el.com" <tony.luck@...el.com>,
        "james.morse@....com" <james.morse@....com>,
        "rrichter@...vell.com" <rrichter@...vell.com>,
        Linuxarm <linuxarm@...wei.com>
Subject: RE: [PATCH 1/1] EDAC/ghes: Fix for NULL pointer dereference in
 ghes_edac_register()

Hello Boris,

Thanks for reviewing.

>-----Original Message-----
>From: linux-edac-owner@...r.kernel.org [mailto:linux-edac-owner@...r.kernel.org] On Behalf Of Borislav Petkov
>Sent: 26 August 2020 09:52
>To: Shiju Jose <shiju.jose@...wei.com>
>Cc: linux-edac@...r.kernel.org; linux-kernel@...r.kernel.org;
>mchehab@...nel.org; tony.luck@...el.com; james.morse@....com;
>rrichter@...vell.com; Linuxarm <linuxarm@...wei.com>
>Subject: Re: [PATCH 1/1] EDAC/ghes: Fix for NULL pointer dereference in
>ghes_edac_register()
>
>On Tue, Aug 25, 2020 at 02:01:08PM +0100, Shiju Jose wrote:
>> After commit b9cae27728d1 ("EDAC/ghes: Scan the system once on
>> driver init") was applied, the following error occurred in
>> ghes_edac_register() when CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled.
>> The NULL ghes_hw.dimms pointer dereferenced by mci_for_each_dimm() in
>> ghes_edac_register() caused the error.
>>
>> The error occurs when all previously initialized ghes instances are
>> removed and a new ghes instance is then probed. In this case,
>> ghes_refcount is 0 and both ghes_hw.dimms and mci have already been
>> freed. ghes_hw.dimms stays NULL because ghes_scan_system() does not
>> call enumerate_dimms() again.
>
>Try the below instead and see if it fixes the issue for you too.
>
>If it does, pls send it as v2 but do not add the splat to the commit message -
>that's a lot of noise for something which is clear why it happens and you
>explain it properly in text anyway.

I tested with your changes and it fixes the issue.  I will send v2.
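
For reference, here is a minimal user-space sketch of the failure mode and of the effect of resetting the flag on unregister. It is not the driver code: every sim_* name is a hypothetical stand-in (sim_hw.dimms for ghes_hw.dimms, sim_refcount for ghes_refcount, sim_scanned for the scan-once flag), the DIMM count is arbitrary, and the register path returns false at the point where the kernel would dereference the NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in ghes_edac.c. */
struct sim_hw {
	int *dimms;		/* stands in for ghes_hw.dimms */
	int num_dimms;
};

static struct sim_hw sim_hw;
static int sim_refcount;	/* stands in for ghes_refcount */
static bool sim_scanned;	/* stands in for the scan-once flag */

/* Enumerate the DIMMs unless the flag says this was already done. */
static void sim_scan_system(void)
{
	if (sim_scanned)
		return;

	sim_hw.num_dimms = 4;	/* arbitrary count, for illustration only */
	sim_hw.dimms = calloc(sim_hw.num_dimms, sizeof(*sim_hw.dimms));
	sim_scanned = true;
}

/* Returns false where the real driver would walk a NULL dimms pointer. */
static bool sim_register(void)
{
	if (sim_refcount > 0) {
		sim_refcount++;
		return true;
	}

	sim_scan_system();
	if (!sim_hw.dimms)
		return false;

	sim_refcount = 1;
	return true;
}

/*
 * reset_flag == false models the behaviour before the change,
 * reset_flag == true models clearing the flag on every unregister.
 */
static void sim_unregister(bool reset_flag)
{
	assert(sim_refcount > 0);

	if (reset_flag)
		sim_scanned = false;

	if (--sim_refcount > 0)
		return;

	/* Last instance gone: free the DIMM array, as the driver does. */
	free(sim_hw.dimms);
	sim_hw.dimms = NULL;
	sim_hw.num_dimms = 0;
}
```

With reset_flag set to false, a register/unregister/register sequence fails on the second register because the stale flag skips the scan while the array is already freed. With reset_flag set to true, the second register rescans and succeeds.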
 
>
>Thx.
>
>---
>diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
>index da60c29468a7..54ebc8afc6b1 100644
>--- a/drivers/edac/ghes_edac.c
>+++ b/drivers/edac/ghes_edac.c
>@@ -55,6 +55,8 @@ static DEFINE_SPINLOCK(ghes_lock);
> static bool __read_mostly force_load;
> module_param(force_load, bool, 0);
>
>+static bool system_scanned;
>+
> /* Memory Device - Type 17 of SMBIOS spec */
> struct memdev_dmi_entry {
> 	u8 type;
>@@ -225,14 +227,12 @@ static void enumerate_dimms(const struct dmi_header *dh, void *arg)
>
> static void ghes_scan_system(void)
> {
>-	static bool scanned;
>-
>-	if (scanned)
>+	if (system_scanned)
> 		return;
>
> 	dmi_walk(enumerate_dimms, &ghes_hw);
>
>-	scanned = true;
>+	system_scanned = true;
> }
>
> void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
>@@ -631,6 +631,8 @@ void ghes_edac_unregister(struct ghes *ghes)
>
> 	mutex_lock(&ghes_reg_mutex);
>
>+	system_scanned = false;
>+
> 	if (!refcount_dec_and_test(&ghes_refcount))
> 		goto unlock;
>
>
>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette

Thanks,
Shiju
