Message-ID: <f6878438-8fcf-4f78-88f5-e7f275b157eb@amd.com>
Date: Fri, 29 Nov 2024 16:02:19 +0530
From: Bharata B Rao <bharata@....com>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Matthew Wilcox <willy@...radead.org>, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, nikunj@....com, vbabka@...e.cz, david@...hat.com,
akpm@...ux-foundation.org, yuzhao@...gle.com, axboe@...nel.dk,
viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
joshdon@...gle.com, clm@...a.com
Subject: Re: [RFC PATCH 0/1] Large folios in block buffered IO path
On 29-Nov-24 5:01 AM, Mateusz Guzik wrote:
> On Thu, Nov 28, 2024 at 12:24 PM Bharata B Rao <bharata@....com> wrote:
>>
>> On 28-Nov-24 10:07 AM, Bharata B Rao wrote:
>>> On 28-Nov-24 9:52 AM, Matthew Wilcox wrote:
>>>> On Thu, Nov 28, 2024 at 09:31:50AM +0530, Bharata B Rao wrote:
>>>>> However, a point of concern is that the FIO bandwidth comes down
>>>>> drastically after the change.
>>>>>
>>>>>                 default                    inode_lock-fix
>>>>> rw=30%
>>>>> Instance 1      r=55.7GiB/s,w=23.9GiB/s   r=9616MiB/s,w=4121MiB/s
>>>>> Instance 2      r=38.5GiB/s,w=16.5GiB/s   r=8482MiB/s,w=3635MiB/s
>>>>> Instance 3      r=37.5GiB/s,w=16.1GiB/s   r=8609MiB/s,w=3690MiB/s
>>>>> Instance 4      r=37.4GiB/s,w=16.0GiB/s   r=8486MiB/s,w=3637MiB/s
>>>>
>>>> Something this dramatic usually only happens when you enable a debugging
>>>> option. Can you recheck that you're running both A and B with the same
>>>> debugging options both compiled in, and enabled?
>>>
>>> It is the same kernel tree with and without Mateusz's inode_lock changes
>>> to block/fops.c. I see that the config remains the same for both builds.
>>>
>>> Let me get a run for both the base and patched cases without perf lock
>>> contention running, to check if that makes a difference.
>>
>> Without perf lock contention
>>
>>                default                    inode_lock-fix
>> rw=30%
>> Instance 1     r=54.6GiB/s,w=23.4GiB/s   r=11.4GiB/s,w=4992MiB/s
>> Instance 2     r=52.7GiB/s,w=22.6GiB/s   r=11.4GiB/s,w=4981MiB/s
>> Instance 3     r=53.3GiB/s,w=22.8GiB/s   r=12.7GiB/s,w=5575MiB/s
>> Instance 4     r=37.7GiB/s,w=16.2GiB/s   r=10.4GiB/s,w=4581MiB/s
>>
>
> Per my other e-mail, can you follow willy's suggestion and increase the hash?
Here are the numbers with Mateusz's inode_lock fix applied and with
PAGE_WAIT_TABLE_BITS set to 10, 14, 16 and 20.
(The READ and WRITE rows for each instance below are the FIO read and
write bandwidths; a sketch of the hash table being resized follows the
results.)
                       10          14          16          20
rw=30%
Instance 1  READ    11.3GiB/s   14.2GiB/s   14.8GiB/s   14.9GiB/s
            WRITE   4965MiB/s   6225MiB/s   6487MiB/s   6552MiB/s
Instance 2  READ    12.3GiB/s   10.4GiB/s   10.9GiB/s   11.0GiB/s
            WRITE   5389MiB/s   4548MiB/s   4770MiB/s   4815MiB/s
Instance 3  READ    11.1GiB/s   12.3GiB/s   11.2GiB/s   13.5GiB/s
            WRITE   4864MiB/s   5410MiB/s   4923MiB/s   5927MiB/s
Instance 4  READ    12.3GiB/s   13.7GiB/s   13.0GiB/s   11.4GiB/s
            WRITE   5404MiB/s   6004MiB/s   5689MiB/s   5007MiB/s
The number of hash buckets doesn't seem to matter all that much in this case.
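
The hash being resized here is the folio wait table in mm/filemap.c.
As a quick sketch of what was varied (upstream defines
PAGE_WAIT_TABLE_BITS as 8; each run above was a rebuild with the
define bumped to 10, 14, 16 or 20):

/* mm/filemap.c (sketch, per recent upstream kernels) */
#define PAGE_WAIT_TABLE_BITS 8	/* 10/14/16/20 in the runs above */
#define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)

static wait_queue_head_t folio_wait_table[PAGE_WAIT_TABLE_SIZE]
	__cacheline_aligned;

/*
 * Folios hash into a shared table of wait queues; more table bits
 * mean fewer unrelated folios colliding on the same wait queue head
 * (and its spinlock) when waiting on folio lock or writeback.
 */
static wait_queue_head_t *folio_waitqueue(struct folio *folio)
{
	return &folio_wait_table[hash_ptr(folio, PAGE_WAIT_TABLE_BITS)];
}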
Regards,
Bharata.