[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yn0SHhnhB8fyd0jq@mit.edu>
Date: Thu, 12 May 2022 09:56:46 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Byungchul Park <byungchul.park@....com>
Cc: tj@...nel.org, torvalds@...ux-foundation.org,
damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
mingo@...hat.com, linux-kernel@...r.kernel.org,
peterz@...radead.org, will@...nel.org, tglx@...utronix.de,
rostedt@...dmis.org, joel@...lfernandes.org, sashal@...nel.org,
daniel.vetter@...ll.ch, chris@...is-wilson.co.uk,
duyuyang@...il.com, johannes.berg@...el.com, willy@...radead.org,
david@...morbit.com, amir73il@...il.com,
gregkh@...uxfoundation.org, kernel-team@....com,
linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
ngupta@...are.org, linux-block@...r.kernel.org,
paolo.valente@...aro.org, josef@...icpanda.com,
linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
jack@...e.cz, jack@...e.com, jlayton@...nel.org,
dan.j.williams@...el.com, hch@...radead.org, djwong@...nel.org,
dri-devel@...ts.freedesktop.org, rodrigosiqueiramelo@...il.com,
melissa.srw@...il.com, hamohammed.sa@...il.com,
42.hyeyoo@...il.com, mcgrof@...nel.org, holt@....com
Subject: Re: [REPORT] syscall reboot + umh + firmware fallback
On Thu, May 12, 2022 at 08:18:24PM +0900, Byungchul Park wrote:
> I have a question about this one. Yes, it would never been stuck thanks
> to timeout. However, IIUC, timeouts are not supposed to expire in normal
> cases. So I thought a timeout expiration means not a normal case so need
> to inform it in terms of dependency so as to prevent further expiraton.
> That's why I have been trying to track even timeout'ed APIs.
As I beleive I've already pointed out to you previously in ext4 and
ocfs2, the jbd2 timeout every five seconds happens **all** the time
while the file system is mounted. Commits more frequently than five
seconds is the exception case, at least for desktops/laptop workloads.
We *don't* get to the timeout only when a userspace process calls
fsync(2), or if the journal was incorrectly sized by the system
administrator so that it's too small, and the workload has so many
file system mutations that we have to prematurely close the
transaction ahead of the 5 second timeout.
> Do you think DEPT shouldn't track timeout APIs? If I was wrong, I
> shouldn't track the timeout APIs any more.
DEPT tracking timeouts will cause false positives in at least some
cases. At the very least, there needs to be an easy way to suppress
these false positives on a per wait/mutex/spinlock basis.
- Ted
Powered by blists - more mailing lists