lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <115c3b08-aff1-dd97-fe6a-7901452ce62c@huaweicloud.com>
Date: Wed, 9 Apr 2025 17:27:11 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Christoph Hellwig <hch@....de>, Yu Kuai <yukuai1@...weicloud.com>
Cc: xni@...hat.com, colyli@...nel.org, axboe@...nel.dk, agk@...hat.com,
 snitzer@...nel.org, mpatocka@...hat.com, song@...nel.org,
 linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
 dm-devel@...ts.linux.dev, linux-raid@...r.kernel.org, yi.zhang@...wei.com,
 yangerkun@...wei.com, kbusch@...nel.org, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC v2 00/14] md: introduce a new lockless bitmap

Hi,

在 2025/04/09 16:32, Christoph Hellwig 写道:
> On Sat, Mar 29, 2025 at 09:11:13AM +0800, Yu Kuai wrote:
>> The purpose here is to hide the low level bitmap IO implementation to
>> the API disk->submit_bio(), and the bitmap IO can be converted to buffer
>> IO to the bdev_file. This is the easiest way that I can think of to
>> resue the pagecache, with natural ability for dirty page writeback. I do
>> think about creating a new anon file and implement a new
>> file_operations, this will be much more complicated.
> 
> I've started looking at this a bit now, sorry for the delay.
> 
> As far as I can see you use the bitmap file just so that you have your
> own struct address_space and thus page cache instance and then call
> read_mapping_page and filemap_write_and_wait_range on it right?
Yes.

> 
> For that you'd be much better of just creating your own trivial
> file_system_type with an inode fully controlled by your driver
> that has a trivial set of address_space ops instead of oddly
> mixing with the block layer.

Yes, this is exactly what I said implement a new file_operations(and
address_space ops), I wanted do this the easy way, just reuse the raw
block device ops, this way I just need to implement the submit_bio ops
for new hidden disk.

I can try with new fs type if we really think this solution is too
hacky, however, the code line will be much more. :(

> 
> Note that either way I'm not sure using the page cache here is an
> all that good idea, as we're at the bottom of the I/O stack and
> thus memory allocations can very easily deadlock.

Yes, for the page from bitmap, this set do the easy way just read and
ping all realted pages while loading the bitmap. For two reasons:

1) We don't need to allocate and read pages from IO path;(In the first
RFC version, I'm using a worker to do that).
2) In the first RFC version, I find and get page in the IO path, turns
out page reference is an *atomic*, and the overhead is not acceptable;

And the only action from IO path is that if bitmap page is dirty,
filemap_write_and_wait_range() is called from async worker, the same as
old bitmap, to flush bitmap dirty pages.
> 
> What speaks against using your own folios explicitly allocated at
> probe time and then just doing manual submit_bio on that?  That's
> probably not much more code but a lot more robust.

I'm not quite sure if I understand you correctly. Do you means don't use
pagecache for bitmap IO, and manually create BIOs like the old bitmap,
meanwhile invent a new solution for synchronism instead of the global
spin_lock from old bitmap?

Thanks,
Kuai

> 
> Also a high level note: the bitmap_operations aren't a very nice
> interface.  A lot of methods are empty and should just be called
> conditionally.  Or even better you'd do away with the expensive
> indirect calls and just directly call either the old or new
> bitmap code.
> 
>> Meanwhile, bitmap file for the old bitmap will be removed sooner or
>> later, and this bdev_file implementation will compatible with bitmap
>> file as well.
> 
> Which would also mean that at that point the operations vector would
> be pointless, so we might as well not add it to start with.
> 
> .
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ