Message-ID: <67db1c22365_551042948@iweiny-mobl.notmuch>
Date: Wed, 19 Mar 2025 14:33:54 -0500
From: Ira Weiny <ira.weiny@...el.com>
To: Robert Richter <rrichter@....com>, Vishal Verma
<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Dan Williams
<dan.j.williams@...el.com>, Dave Jiang <dave.jiang@...el.com>
CC: Alison Schofield <alison.schofield@...el.com>, Jonathan Cameron
<Jonathan.Cameron@...wei.com>, <linux-cxl@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Davidlohr Bueso <dave@...olabs.net>, "Gregory
Price" <gourry@...rry.net>, Terry Bowman <terry.bowman@....com>, "Robert
Richter" <rrichter@....com>, <nvdimm@...ts.linux.dev>
Subject: Re: [PATCH] libnvdimm/labels: Fix divide error in
nd_label_data_init()
Robert Richter wrote:
> If a CXL memory device returns a broken zero LSA size in its memory
> device information (Identify Memory Device (Opcode 4000h), CXL
> spec. 3.1, 8.2.9.9.1.1), a divide error occurs in the libnvdimm
> driver:
>
> Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI
> RIP: 0010:nd_label_data_init+0x10e/0x800 [libnvdimm]
>
> Code and flow:
>
> 1) CXL Command 4000h returns LSA size = 0,
> 2) config_size is assigned to zero LSA size (CXL pmem driver):
>
> drivers/cxl/pmem.c: .config_size = mds->lsa_size,
>
> 3) max_xfer is set to zero (nvdimm driver):
>
> drivers/nvdimm/label.c: max_xfer = min_t(size_t, ndd->nsarea.max_xfer, config_size);
> drivers/nvdimm/label.c: if (read_size < max_xfer) {
> drivers/nvdimm/label.c- /* trim waste */
>
> 4) DIV_ROUND_UP() causes division by zero:
>
> drivers/nvdimm/label.c: max_xfer -= ((max_xfer - 1) - (config_size - 1) % max_xfer) /
> drivers/nvdimm/label.c: DIV_ROUND_UP(config_size, max_xfer);
I think this points at the wrong DIV_ROUND_UP: with max_xfer at zero,
read_size is never less than max_xfer, is it?
I believe the failing DIV_ROUND_UP is the one after the if statement, here:
        /* Make our initial read size a multiple of max_xfer size */
        read_size = min(DIV_ROUND_UP(read_size, max_xfer) * max_xfer,
                        config_size);
Apparently nvdimm_get_config_data() was intended to check for this implicitly,
but that check comes too late.
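To make the ordering concrete, here is a minimal userspace sketch of the idea,
not the kernel code. The helper name initial_read_size(), its parameters, and
the -1 return value are assumptions made for illustration, and the "trim
waste" branch is left out. It only shows that a zero transfer size has to be
rejected before the rounding step divides by it:

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's DIV_ROUND_UP(): the divisor d must be non-zero. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical userspace helper, not the kernel function. It omits the
 * "trim waste" branch and only models the final rounding step.
 * Returns 0 and stores the rounded read size in *out, or -1 if the
 * transfer size would be zero (e.g. a device reporting LSA size 0).
 */
static int initial_read_size(size_t read_size, size_t nsarea_max_xfer,
			     size_t config_size, size_t *out)
{
	size_t max_xfer = min_size(nsarea_max_xfer, config_size);

	assert(out != NULL);

	/* Reject a zero divisor before any DIV_ROUND_UP() can see it. */
	if (max_xfer == 0)
		return -1;

	/* Make the initial read size a multiple of max_xfer. */
	*out = min_size(DIV_ROUND_UP(read_size, max_xfer) * max_xfer,
			config_size);
	return 0;
}
```

In this sketch, a config_size of 0 returns -1 instead of dividing, while a
read_size of 300 with a maximum transfer of 128 and a config_size of 1024
rounds up to 384.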
Anyway, all this sidetracked me a bit.
I assume this is a broken device that exists in the real world? The fix looks
fine, but could you re-spin with a cleaned-up commit message? Then I'll
queue it up.
Reviewed-by: Ira Weiny <ira.weiny@...el.com>
[snip]