Date: Fri, 31 May 2024 16:16:37 +0800
From: Chengming Zhou <chengming.zhou@...ux.dev>
To: Christoph Hellwig <hch@....de>
Cc: Friedrich Weber <f.weber@...xmox.com>, axboe@...nel.dk,
 ming.lei@...hat.com, bvanassche@....org, linux-block@...r.kernel.org,
 linux-kernel@...r.kernel.org, zhouchengming@...edance.com
Subject: Re: [PATCH v4 4/4] blk-flush: reuse rq queuelist in flush state
 machine

On 2024/5/31 14:17, Christoph Hellwig wrote:
> On Wed, May 29, 2024 at 04:50:02PM +0800, Chengming Zhou wrote:
>> Yes, because we use list_move_tail() in the flush sequences. Maybe we can
>> just use list_add_tail() so we don't need the queuelist initialized. It
>> should be ok since the rq can't be on any list during the PREFLUSH or
>> POSTFLUSH steps, so there isn't actually any move.
> 
> Sounds good.

Ok, I can send a fix later that switches to list_add_tail().
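
For reference, that matches how the list.h helpers behave (paraphrased
from include/linux/list.h):

    /*
     * list_move_tail() unlinks the entry first, so it dereferences the
     * entry's own prev/next pointers - they must point somewhere valid.
     */
    static inline void list_move_tail(struct list_head *list,
                                      struct list_head *head)
    {
            __list_del_entry(list);         /* reads list->prev/next */
            list_add_tail(list, head);
    }

    /*
     * list_add_tail() only writes the new links and never reads the
     * entry's old pointers, so an uninitialized queuelist is harmless.
     */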

> 
>> But now I'm concerned that rq->queuelist may be changed by the driver
>> after the request ends?
> 
> How could the driver change it?

I don't know much about drivers. Normally, a driver will detach rq->queuelist
from its internal list and then call blk_mq_end_request(), in which we reuse
the queuelist to add the rq to the post-flush list.
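
For example, I'd expect the common completion pattern to look roughly like
this (a sketch only; dev->lock and the internal list are made-up names):

    /* driver completion path */
    spin_lock(&dev->lock);
    list_del_init(&rq->queuelist);      /* detach from the driver's list */
    spin_unlock(&dev->lock);
    blk_mq_end_request(rq, BLK_STS_OK); /* flush code may reuse queuelist
                                           from here on */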

Strictly speaking, ownership of that rq still belongs to the driver until it
calls blk_mq_free_request(), right? So I'm not sure whether any driver touches
rq->queuelist after blk_mq_end_request(). If no driver does that, then we are
good.
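
To make the concern concrete, a hypothetical pattern like the following
would be broken (purely illustrative, not taken from any real driver):

    blk_mq_end_request(rq, BLK_STS_OK);
    /*
     * BAD: for a PREFLUSH/POSTFLUSH rq, the flush machinery may have
     * already reused rq->queuelist for the post-flush list by now.
     */
    list_del(&rq->queuelist);
    blk_mq_free_request(rq);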

> 
>>> Also, just out of interest: Can you estimate whether this issue is
>>> specific to software RAID setups, or could similar NULL pointer
>>> dereferences also happen in setups without software RAID?
>>
>> I think it can also happen without software RAID.
> 
> Seems to be about batch allocation.  So you either need a plug in
> the stacking device, or io_uring.  I guess people aren't using the
> io_uring high-performance options on devices with a write cache
> all that much, as that should reproduce the problem immediately.
> 
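
For anyone trying to reproduce this, the plug path mentioned above is the
standard submit-side batching pattern (a minimal sketch; bio_a and bio_b
are placeholders, not taken from any particular stacking driver):

    struct blk_plug plug;

    blk_start_plug(&plug);
    submit_bio(bio_a);          /* requests for plugged bios can be */
    submit_bio(bio_b);          /* allocated in batches */
    blk_finish_plug(&plug);     /* hands the batched requests to the driver */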
