Message-ID: <YFI299oMXylsG9kB@mit.edu>
Date: Wed, 17 Mar 2021 13:05:59 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Shashidhar Patil <shashidhar.patil@...il.com>
Cc: linux-ext4@...r.kernel.org
Subject: Re: jbd2 task hung in jbd2_journal_commit_transaction
On Wed, Mar 17, 2021 at 08:30:56PM +0530, Shashidhar Patil wrote:
> Hi Theodore,
> Thank you for the details about the journalling layer and
> insight into the block device layer.
> I think we may have gotten lucky. The swap file in our case is
> attached to a loop block device before enabling swap using swapon.
> Since loop driver processes its IO requests by calling
> vfs_iter_write() the write requests re-enter the ext4
> filesystem/journalling code.
> Is that right? There seems to be a possibility of a cyclic dependency.
If that hypothesis is correct, you should see an example of that in
one of your stack traces; do you? The loop device creates a struct file
where the file is opened using O_DIRECT. In the O_DIRECT code path,
assuming the file was fully allocated and initialized, it shouldn't
involve starting a journal handle.
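
One way to check the stack traces for that pattern is a small script along
these lines. It is only a rough sketch: the frame names in LOOP_FRAMES and
JOURNAL_FRAMES are assumptions that vary by kernel version, and splitting the
log on "Call Trace:" lines is a simplification of the real hung-task output.

```python
import re

# Assumed frame names; adjust both tuples to match the kernel in use.
LOOP_FRAMES = ("loop_queue_work", "lo_write_bvec", "vfs_iter_write")
JOURNAL_FRAMES = ("jbd2__journal_start", "start_this_handle")


def split_traces(log_text):
    """Split a kernel log into chunks, one per 'Call Trace:' line."""
    parts = re.split(r"^.*Call Trace:.*$", log_text, flags=re.MULTILINE)
    # parts[0] is whatever precedes the first trace, so drop it.
    return [p for p in parts[1:] if p.strip()]


def reentrant_traces(log_text):
    """Return traces holding both a loop-driver frame and a journal-start frame."""
    hits = []
    for trace in split_traces(log_text):
        has_loop = any(name in trace for name in LOOP_FRAMES)
        has_journal = any(name in trace for name in JOURNAL_FRAMES)
        if has_loop and has_journal:
            hits.append(trace)
    return hits
```

If reentrant_traces() comes back empty for the dmesg output captured during
the hang, the loop write path is probably not starting a journal handle.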
That being said, why are you using a loop device for a swap device at
all? Using a swap file directly is going to be much more efficient,
and decrease the stack depth and CPU cycles needed to do a swap out if
nothing else. If you can reliably reproduce the problem, what happens
if you use a swap file directly and take the loop device out of the
swap path? Does it make the problem go away?
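
To confirm which configuration a given test run is actually using, the active
swap areas can be read from /proc/swaps. A minimal sketch, assuming the usual
layout of that file (one header line, then whitespace-separated columns with
the path first) and that loop devices appear as /dev/loopN:

```python
def loop_backed_swaps(proc_swaps_text):
    """Return the swap areas in /proc/swaps text that are loop devices."""
    found = []
    # Skip the 'Filename Type Size Used Priority' header line.
    for line in proc_swaps_text.splitlines()[1:]:
        fields = line.split()
        if fields and fields[0].startswith("/dev/loop"):
            found.append(fields[0])
    return found


if __name__ == "__main__":
    with open("/proc/swaps") as f:
        for dev in loop_backed_swaps(f.read()):
            print("swap is on a loop device:", dev)
```

An empty result after switching to a plain swap file means the loop device is
no longer in the swap path.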
- Ted