Message-ID: <f922b3e8-eecb-42c9-8dba-f17f9bdd31b0@oracle.com>
Date: Tue, 11 Jun 2024 11:00:43 +0100
From: John Garry <john.g.garry@...cle.com>
To: "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>
Cc: david@...morbit.com, djwong@...nel.org, chandan.babu@...cle.com,
brauner@...nel.org, akpm@...ux-foundation.org, willy@...radead.org,
mcgrof@...nel.org, linux-mm@...ck.org, hare@...e.de,
linux-kernel@...r.kernel.org, yang@...amperecomputing.com,
Zi Yan <zi.yan@...t.com>, linux-xfs@...r.kernel.org,
p.raghav@...sung.com, linux-fsdevel@...r.kernel.org, hch@....de,
gost.dev@...sung.com, cl@...amperecomputing.com
Subject: Re: [PATCH v7 07/11] iomap: fix iomap_dio_zero() for fs bs > system
page size
On 11/06/2024 10:41, Pankaj Raghav (Samsung) wrote:
>>> 8419fcc7..9f791db473e4 100644
>>> --- a/fs/iomap/buffered-io.c
>>> +++ b/fs/iomap/buffered-io.c
>>> @@ -1990,6 +1990,12 @@ EXPORT_SYMBOL_GPL(iomap_writepages);
>>> static int __init iomap_init(void)
>>> {
>>> + int ret;
>>> +
>>> + ret = iomap_dio_init();
>>> + if (ret)
>>> + return ret;
>>> +
>>> return bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE),
>>> offsetof(struct iomap_ioend, io_bio),
>>> BIOSET_NEED_BVECS);
>> I suppose that it does not matter that zero_fs_block is leaked if this
>> fails (or is it even leaked?), as I don't think a failure of that
>> bioset_init() call is handled at all.
> If bioset_init() fails, then we have even bigger problems than just 64k
> of leaked memory? 😉
>
Right
> Do you have something like this in mind?
I wouldn't propose anything myself. AFAICS, we don't gracefully handle
bioset_init() failing and iomap_ioend_bioset not being initialized properly.
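That said, if we did want to handle it, I suppose a minimal sketch would
be something like the below (untested; iomap_dio_exit() here is a
hypothetical helper that frees zero_fs_block):

static int __init iomap_init(void)
{
	int ret;

	ret = iomap_dio_init();
	if (ret)
		return ret;

	ret = bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE),
			  offsetof(struct iomap_ioend, io_bio),
			  BIOSET_NEED_BVECS);
	if (ret)
		iomap_dio_exit();	/* hypothetical: undo iomap_dio_init() */
	return ret;
}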
> static struct bio *iomap_dio_alloc_bio(const struct iomap_iter *iter,
> struct iomap_dio *dio, unsigned short nr_vecs, blk_opf_t opf)
> {
>
>>> +
>>> static struct bio *iomap_dio_alloc_bio(const struct iomap_iter *iter,
>>> struct iomap_dio *dio, unsigned short nr_vecs, blk_opf_t opf)
>>> {
>>> @@ -236,17 +253,22 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
>>> loff_t pos, unsigned len)
>>> {
>>> struct inode *inode = file_inode(dio->iocb->ki_filp);
>>> - struct page *page = ZERO_PAGE(0);
>>> struct bio *bio;
>>> + /*
>>> + * Max block size supported is 64k
>>> + */
>>> + WARN_ON_ONCE(len > ZERO_FSB_SIZE);
>> JFYI, as mentioned in
>> https://lore.kernel.org/linux-xfs/20240429174746.2132161-1-john.g.garry@oracle.com/T/#m5354e2b2531a5552a8b8acd4a95342ed4d7500f2,
>> we would like to support an arbitrary size. Maybe I will need to loop for
>> zeroing sizes > 64K.
> The initial patches looped with ZERO_PAGE(0), but the initial feedback
> was to use a huge zero page. When I discussed that at LSF, though,
> people thought we would be using a lot of memory for sub-block zeroing,
> especially on architectures with a 64k base page size.
>
> So for now, a good tradeoff between memory usage and efficiency was to
> use a 64k buffer, as that is the maximum FSB size we support. [1]
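So IIUC that amounts to a single boot-time allocation, roughly along
these lines (my rough reading, using the zero_fs_block/ZERO_FSB_SIZE
names from this series, not the literal patch):

static struct folio *zero_fs_block;

static int __init iomap_dio_init(void)
{
	/* one zeroed 64k folio, shared by all sub-block DIO zeroing */
	zero_fs_block = folio_alloc(GFP_KERNEL | __GFP_ZERO,
				    get_order(ZERO_FSB_SIZE));
	if (!zero_fs_block)
		return -ENOMEM;
	return 0;
}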
>
> IIUC, you will also be using this function to zero out the extent, and
> not just a single FSB?
Right. Or, more specifically, the FS can ask for the zeroing size.
Typically it will be the inode i_blocksize, but the FS can ask for a
larger size to zero out to extent alignment boundaries.
>
> I think we could resort to looping until we have a way to request
> arbitrary zero folios without having to allocate one in
> iomap_dio_alloc_bio() for every IO.
>
ok
> [1] https://lore.kernel.org/linux-xfs/20240529134509.120826-8-kernel@pankajraghav.com/
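FWIW, the looping I have in mind for zeroing sizes beyond 64k would be
something like this rough, untested sketch, reusing zero_fs_block and
the bio setup from this patch:

static void iomap_dio_zero(const struct iomap_iter *iter,
		struct iomap_dio *dio, loff_t pos, loff_t len)
{
	struct inode *inode = file_inode(dio->iocb->ki_filp);

	while (len) {
		/* cap each bio at the 64k zero folio size */
		unsigned int io_len = min_t(loff_t, len, ZERO_FSB_SIZE);
		struct bio *bio;

		bio = iomap_dio_alloc_bio(iter, dio, 1,
				REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
		fscrypt_set_bio_crypt_ctx(bio, inode,
				pos >> inode->i_blkbits, GFP_KERNEL);
		bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
		bio->bi_private = dio;
		bio->bi_end_io = iomap_dio_bio_end_io;

		bio_add_folio_nofail(bio, zero_fs_block, io_len, 0);
		iomap_dio_submit_bio(iter, dio, bio, pos);

		pos += io_len;
		len -= io_len;
	}
}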
Thanks,
John