Message-ID: <ogv7evtgf5rcljd4ev7rxx6buqg7y3kwlqing2tfshh2ac5zf6@juclnksnv4ki>
Date: Mon, 31 Jul 2023 19:46:01 -0700
From: Davidlohr Bueso <dave@...olabs.net>
To: Dan Williams <dan.j.williams@...el.com>
Cc: Ira Weiny <ira.weiny@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cxl/memdev: Avoid mailbox functionality on device memory CXL devices
On Fri, 28 Jul 2023, Dan Williams wrote:
>Ira Weiny wrote:
>> Using the proposed type-2 cxl-test device[1] the following
>> splat was observed:
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000278
>> [...]
>> RIP: 0010:devm_cxl_add_memdev+0x1de/0x2c0 [cxl_core]
>
>It would be useful to decode this to a line number, the rest of this
>call trace is not adding much.
>
>> [...]
>> Call Trace:
>> <TASK>
>> ? __die+0x1f/0x70
>> ? page_fault_oops+0x149/0x420
>> ? fixup_exception+0x22/0x310
>> ? kernelmode_fixup_or_oops+0x84/0x110
>> ? exc_page_fault+0x6d/0x150
>> ? asm_exc_page_fault+0x22/0x30
>> ? devm_cxl_add_memdev+0x1de/0x2c0 [cxl_core]
>> cxl_mock_mem_probe+0x632/0x870 [cxl_mock_mem]
>> platform_probe+0x40/0x90
>> really_probe+0x19e/0x3e0
>> ? __pfx___driver_attach+0x10/0x10
>> __driver_probe_device+0x78/0x160
>> driver_probe_device+0x1f/0x90
>> __driver_attach+0xce/0x1c0
>> bus_for_each_dev+0x63/0xa0
>> bus_add_driver+0x112/0x210
>> driver_register+0x55/0x100
>> ? __pfx_cxl_mock_mem_driver_init+0x10/0x10 [cxl_mock_mem]
>> [...]
>>
>> Commit f6b8ab32e3ec made the mailbox functionality optional. However,
>> some mailbox functionality was merged after that patch. Therefore some
>> mailbox functionality can be accessed on a device which did not set up
>> the mailbox.
>
>cxl_memdev_security_init() definitely needs to move out of
>devm_cxl_add_memdev() and after that I do not think @mds NULL checks
>need to be sprinkled everywhere. In other words something is wrong at a
>higher level if we get into some of these helper functions without the
>memory device state.
Right, so we can move it directly into cxl_pci_probe(), just as with other
mbox-based functionality. This leaves me wondering, however, what to do about
the cxl_memdev_security_shutdown() counterpart. As in the diff below, leaving
it as is and just adding an mds NULL check might still be considered a layering
violation, in that it would be asymmetrical wrt the init; but it is tightly
coupled with cxl_memdev_unregister().
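To make the asymmetry concrete, here is a minimal userspace sketch, not kernel
code. All mock_* names are hypothetical stand-ins, and it assumes the state
lookup returns NULL for a device that never set up a mailbox. Init is only
reachable from the mailbox-capable probe path, while shutdown runs for every
device and therefore has to tolerate a NULL mds:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
enum mock_devtype { MOCK_DEVTYPE_DEVMEM, MOCK_DEVTYPE_CLASSMEM };

struct mock_dev_state {
	enum mock_devtype type;
};

struct mock_memdev_state {
	struct mock_dev_state cxlds;	/* embedded base state */
	bool poll_active;		/* stands in for security.poll */
};

/* Assumption: returns NULL when the device has no mailbox-backed state. */
static struct mock_memdev_state *
mock_to_memdev_state(struct mock_dev_state *cxlds)
{
	if (cxlds->type != MOCK_DEVTYPE_CLASSMEM)
		return NULL;
	/* container_of()-style recovery of the enclosing structure */
	return (struct mock_memdev_state *)((char *)cxlds -
			offsetof(struct mock_memdev_state, cxlds));
}

/* Init side: called only from the mailbox-capable probe path. */
static int mock_security_init(struct mock_memdev_state *mds)
{
	assert(mds != NULL);	/* the caller already owns a valid mds */
	mds->poll_active = true;
	return 0;
}

/*
 * Shutdown side: runs for every device on unregister, so it must cope
 * with a NULL mds. Returns 1 if pending poll work was cancelled.
 */
static int mock_security_shutdown(struct mock_dev_state *cxlds)
{
	struct mock_memdev_state *mds = mock_to_memdev_state(cxlds);

	if (mds && mds->poll_active) {
		mds->poll_active = false;
		return 1;
	}
	return 0;
}
```

In this sketch a device-memory (type-2 style) state goes through shutdown
without ever having been through init, which is exactly the case the NULL
check covers.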
Ira, does the below fix the crash?
Thanks,
Davidlohr
----8<-------
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 14b547c07f54..4d1bf80c0e54 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -561,7 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
- if (mds->security.poll)
+ if (mds && mds->security.poll)
cancel_delayed_work_sync(&mds->security.poll_dwork);
}
@@ -1009,11 +1009,11 @@ static void put_sanitize(void *data)
sysfs_put(mds->security.sanitize_node);
}
-static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
+int cxl_memdev_security_state_init(struct cxl_memdev_state *mds)
{
- struct cxl_dev_state *cxlds = cxlmd->cxlds;
- struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
- struct device *dev = &cxlmd->dev;
+
+ struct cxl_dev_state *cxlds = &mds->cxlds;
+ struct device *dev = &cxlds->cxlmd->dev;
struct kernfs_node *sec;
sec = sysfs_get_dirent(dev->kobj.sd, "security");
@@ -1029,7 +1029,8 @@ static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
}
return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
- }
+}
+EXPORT_SYMBOL_NS_GPL(cxl_memdev_security_state_init, CXL);
struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
{
@@ -1059,10 +1060,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
if (rc)
goto err;
- rc = cxl_memdev_security_init(cxlmd);
- if (rc)
- goto err;
-
rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
if (rc)
return ERR_PTR(rc);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f86afef90c91..441270770519 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -884,6 +884,7 @@ static inline void cxl_mem_active_dec(void)
#endif
int cxl_mem_sanitize(struct cxl_memdev_state *mds, u16 cmd);
+int cxl_memdev_security_state_init(struct cxl_memdev_state *mds);
struct cxl_hdm {
struct cxl_component_regs regs;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 1cb1494c28fe..5242dbf0044d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -887,6 +887,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (IS_ERR(cxlmd))
return PTR_ERR(cxlmd);
+ rc = cxl_memdev_security_state_init(mds);
+ if (rc)
+ return rc;
+
rc = cxl_memdev_setup_fw_upload(mds);
if (rc)
return rc;