Date: Fri, 31 May 2024 16:16:37 +0800
From: Chengming Zhou <chengming.zhou@...ux.dev>
To: Christoph Hellwig <hch@....de>
Cc: Friedrich Weber <f.weber@...xmox.com>, axboe@...nel.dk,
 ming.lei@...hat.com, bvanassche@....org, linux-block@...r.kernel.org,
 linux-kernel@...r.kernel.org, zhouchengming@...edance.com
Subject: Re: [PATCH v4 4/4] blk-flush: reuse rq queuelist in flush state
 machine

On 2024/5/31 14:17, Christoph Hellwig wrote:
> On Wed, May 29, 2024 at 04:50:02PM +0800, Chengming Zhou wrote:
>> Yes, because we use list_move_tail() in the flush sequences. Maybe we can
>> just use list_add_tail() so we don't need the queuelist initialized. It
>> should be ok since the rq can't be on any list during the PREFLUSH or
>> POSTFLUSH steps, so there isn't actually any move.
> 
> Sounds good.

Ok, I can send a fix later that switches to list_add_tail().
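
For reference, that matches how the list.h helpers behave (paraphrased
from include/linux/list.h):

    /*
     * list_move_tail() unlinks the entry first, so it dereferences the
     * entry's own prev/next pointers - they must point somewhere valid.
     */
    static inline void list_move_tail(struct list_head *list,
                                      struct list_head *head)
    {
            __list_del_entry(list);         /* reads list->prev/next */
            list_add_tail(list, head);
    }

    /*
     * list_add_tail() only writes the new links and never reads the
     * entry's old pointers, so an uninitialized queuelist is harmless.
     */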

> 
>> But now I'm concerned that rq->queuelist may be changed by the driver
>> after the request ends?
> 
> How could the driver change it?

I don't know much about drivers. Normally, a driver will detach rq->queuelist
from its internal list and then call blk_mq_end_request(), in which we reuse
the queuelist to add the rq to the post-flush list.
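
For example, I'd expect the common completion pattern to look roughly like
this (a sketch only; dev->lock and the internal list are made-up names):

    /* driver completion path */
    spin_lock(&dev->lock);
    list_del_init(&rq->queuelist);      /* detach from the driver's list */
    spin_unlock(&dev->lock);
    blk_mq_end_request(rq, BLK_STS_OK); /* flush code may reuse queuelist
                                           from here on */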

Strictly speaking, ownership of that rq still belongs to the driver until it
calls blk_mq_free_request(), right? So I'm not sure whether any driver touches
rq->queuelist after blk_mq_end_request(). If no driver does that, then we are
good.
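
To make the concern concrete, a hypothetical pattern like the following
would be broken (purely illustrative, not taken from any real driver):

    blk_mq_end_request(rq, BLK_STS_OK);
    /*
     * BAD: for a PREFLUSH/POSTFLUSH rq, the flush machinery may have
     * already reused rq->queuelist for the post-flush list by now.
     */
    list_del(&rq->queuelist);
    blk_mq_free_request(rq);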

> 
>>> Also, just out of interest: Can you estimate whether this issue is
>>> specific to software RAID setups, or could similar NULL pointer
>>> dereferences also happen in setups without software RAID?
>>
>> I think it can also happen without software RAID.
> 
> Seems to be about batch allocation.  So you either need a plug in
> the stacking device, or io_uring.  I guess people aren't using the
> io_uring high-performance options on devices with a write cache
> all that much, as that should reproduce the problem immediately.
> 
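
For anyone trying to reproduce this, the plug path mentioned above is the
standard submit-side batching pattern (a minimal sketch; bio_a and bio_b
are placeholders, not taken from any particular stacking driver):

    struct blk_plug plug;

    blk_start_plug(&plug);
    submit_bio(bio_a);          /* requests for plugged bios can be */
    submit_bio(bio_b);          /* allocated in batches */
    blk_finish_plug(&plug);     /* hands the batched requests to the driver */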
