Message-ID: <6587152b64d9f_c579e29437@iweiny-mobl.notmuch>
Date: Sat, 23 Dec 2023 09:13:15 -0800
From: Ira Weiny <ira.weiny@...el.com>
To: Coly Li <colyli@...e.de>, Ira Weiny <ira.weiny@...el.com>,
<linan666@...weicloud.com>
CC: Dan Williams <dan.j.williams@...el.com>, Jens Axboe <axboe@...nel.dk>,
Xiao Ni <xni@...hat.com>, Geliang Tang <geliang.tang@...e.com>, "Hannes
Reinecke" <hare@...e.de>, NeilBrown <neilb@...e.de>, Vishal L Verma
<vishal.l.verma@...el.com>, <linux-block@...r.kernel.org>,
<nvdimm@...ts.linux.dev>, <linux-kernel@...r.kernel.org>
Subject: Re: Bug in commit aa511ff8218b ("badblocks: switch to the improved
badblock handling
Coly Li wrote:
[snip]
>
> Hi Ira,
>
> The above information is accurate and very helpful, thank you!
>
> From __badblocks_check(), the problematic code block is,
> 1303 re_check:
> 1304         bad.start = s;
> 1305         bad.len = sectors;
> 1306
> 1307         if (badblocks_empty(bb)) {
> 1308                 len = sectors;
> 1309                 goto update_sectors;
> 1310         }
> 1311
> 1312         prev = prev_badblocks(bb, &bad, hint);
> 1313
> 1314         /* start after all badblocks */
> 1315         if ((prev + 1) >= bb->count && !overlap_front(bb, prev, &bad)) {
> 1316                 len = sectors;
> 1317                 goto update_sectors;
> 1318         }
> 1319
> 1320         if (overlap_front(bb, prev, &bad)) {
> 1321                 if (BB_ACK(p[prev]))
> 1322                         acked_badblocks++;
> 1323                 else
> 1324                         unacked_badblocks++;
> 1325
> 1326                 if (BB_END(p[prev]) >= (s + sectors))
> 1327                         len = sectors;
> 1328                 else
> 1329                         len = BB_END(p[prev]) - s;
> 1330
> 1331                 if (set == 0) {
> 1332                         *first_bad = BB_OFFSET(p[prev]);
> 1333                         *bad_sectors = BB_LEN(p[prev]);
> 1334                         set = 1;
> 1335                 }
> 1336                 goto update_sectors;
> 1337         }
> 1338
> 1339         /* Not front overlap, but behind overlap */
> 1340         if ((prev + 1) < bb->count && overlap_behind(bb, &bad, prev + 1)) {
> 1341                 len = BB_OFFSET(p[prev + 1]) - bad.start;
> 1342                 hint = prev + 1;
> 1343                 goto update_sectors;
> 1344         }
> 1345
> 1346         /* not cover any badblocks range in the table */
> 1347         len = sectors;
> 1348
> 1349 update_sectors:
>
> If the range being checked lies before all badblocks records in the table,
> prev_badblocks() returns -1. The code between line 1314 and line 1337
> doesn't handle this implicit '-1' value properly, so the counter
> unacked_badblocks is mistakenly increased at line 1324.
>
> So the value of prev should be checked to make sure it is '>= 0' before
> comparing the range being checked with the badblock record returned by
> prev_badblocks(). Otherwise the comparison doesn't make sense.
>
> For badblocks_set() and badblocks_clear(), 'prev < 0' is explicitly checked,
> so the value '-1' doesn't go through into the following code.
>
> Could you please apply and try the attached patch? Hope it may help a bit.
>
> And now it is weekend time, you may be out of office and not able to access
> the testing hardware. I will do more testing from my side and update more info
> if necessary.
>
> Thanks for the report and debug!
>
> Coly Li
>
> [debug patch snipped]
This debug patch does fix our tests. Thanks!
But Nan has submitted a series to fix this as well.[1]
I'm going to test his series as well.
Thanks!
Ira
[1] https://lore.kernel.org/linux-block/20231223063728.3229446-1-linan666@huaweicloud.com/
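
For readers following the thread, the failure mode Coly describes (using the
'-1' returned by prev_badblocks() as an array index) can be illustrated with a
minimal user-space sketch. This is neither the snipped debug patch nor Nan's
series. The names struct bb_rec, struct bb_table, prev_record(),
overlaps_front() and first_sector_state() are hypothetical simplifications made
up for this example, and the table is modelled as plain structs rather than the
kernel's packed u64 entries. It only models the front-overlap step, assuming
records are sorted by start sector and do not overlap.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for one badblocks record (assumption: the kernel
 * packs offset, length and ack bit into a single u64 instead). */
struct bb_rec {
	unsigned long long start;	/* first bad sector */
	unsigned long long len;		/* number of bad sectors */
	bool acked;			/* record has been acknowledged */
};

/* Hypothetical table: records sorted by start, non-overlapping. */
struct bb_table {
	const struct bb_rec *recs;
	int count;
};

/* Index of the last record whose start is <= s, or -1 if s lies
 * before every record (or the table is empty). */
static int prev_record(const struct bb_table *t, unsigned long long s)
{
	int prev = -1;

	for (int i = 0; i < t->count && t->recs[i].start <= s; i++)
		prev = i;
	return prev;
}

/* True if sector s falls inside record idx. The index must be valid;
 * passing -1 here is exactly the mistake being illustrated. */
static bool overlaps_front(const struct bb_table *t, int idx,
			   unsigned long long s)
{
	assert(idx >= 0 && idx < t->count);
	return s >= t->recs[idx].start &&
	       s < t->recs[idx].start + t->recs[idx].len;
}

/* State of sector s: 0 if clean, 1 if inside an acked record,
 * -1 if inside an unacked record. */
static int first_sector_state(const struct bb_table *t, unsigned long long s)
{
	int prev = prev_record(t, s);

	/* Guard: only look at recs[prev] when prev is a real index. */
	if (prev < 0 || !overlaps_front(t, prev, s))
		return 0;

	return t->recs[prev].acked ? 1 : -1;
}
```

With the 'prev < 0' guard in place, a range that starts before the first
record is reported as clean instead of being compared against a record that
does not exist.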