Message-ID: <4bb8bfe1-5de6-4b5d-af90-ab24848c772b@gmail.com>
Date: Tue, 26 Nov 2024 09:01:35 +0100
From: Anders Blomdell <anders.blomdell@...il.com>
To: Philippe Troin <phil@...i.org>, Jan Kara <jack@...e.cz>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: Regression in NFS probably due to very large amounts of readahead

On 2024-11-26 02:48, Philippe Troin wrote:
> On Sat, 2024-11-23 at 23:32 +0100, Anders Blomdell wrote:
>> When we (re)started one of our servers with 6.11.3-200.fc40.x86_64,
>> we got terrible performance (lots of "nfs: server x.x.x.x not
>> responding").
>> What triggered this problem was virtual machines with NFS-mounted
>> qcow2 disks that often triggered large readaheads, generating long
>> streaks of disk I/O of 150-600 MB/s (4 ordinary HDDs) that filled up
>> the buffer/cache area of the machine.
>>
>> A git bisect gave the following suspect:
>>
>> git bisect start
>
> 8< snip >8
>
>> # first bad commit: [7c877586da3178974a8a94577b6045a48377ff25]
>> readahead: properly shorten readahead when falling back to
>> do_page_cache_ra()
>
> Thank you for taking the time to bisect, this issue has been bugging
> me, but it's been non-deterministic, and hence hard to bisect.
>
> I'm seeing the same problem on 6.11.10 (and earlier 6.11.x kernels) in
> slightly different setups:
>
> (1) On machines mounting NFSv3 shared drives. The symptom here is a
> "nfs server XXX not responding, still trying" that never recovers
> (while the server remains pingable and other NFSv3 volumes from the
> hanging server can be mounted).
>
> (2) On VMs running over qemu-kvm, I see very long stalls (can be up to
> several minutes) on random I/O. These stalls eventually recover.
>
> I've built a 6.11.10 kernel with
> 7c877586da3178974a8a94577b6045a48377ff25 reverted and I'm back to
> normal (no more NFS hangs, no more VM stalls).
>
> Phil.

Some printk debugging seems to indicate that the problem is that the
quantity 'ra->size - (index - start)' goes negative, which then gets
cast to a very large unsigned 'nr_to_read' when calling
'do_page_cache_ra'. Where the true bug lies still eludes me, though.
/Anders