linux-kernel - Re: Report 2 in ext4 and journal based on v5.17-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220303013635.GC20752@X58A-UD3R>
Date:   Thu, 3 Mar 2022 10:36:35 +0900
From:   Byungchul Park <byungchul.park@....com>
To:     Theodore Ts'o <tytso@....edu>
Cc:     Jan Kara <jack@...e.cz>, torvalds@...ux-foundation.org,
        damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        mingo@...hat.com, linux-kernel@...r.kernel.org,
        peterz@...radead.org, will@...nel.org, tglx@...utronix.de,
        rostedt@...dmis.org, joel@...lfernandes.org, sashal@...nel.org,
        daniel.vetter@...ll.ch, chris@...is-wilson.co.uk,
        duyuyang@...il.com, johannes.berg@...el.com, tj@...nel.org,
        willy@...radead.org, david@...morbit.com, amir73il@...il.com,
        bfields@...ldses.org, gregkh@...uxfoundation.org,
        kernel-team@....com, linux-mm@...ck.org, akpm@...ux-foundation.org,
        mhocko@...nel.org, minchan@...nel.org, hannes@...xchg.org,
        vdavydov.dev@...il.com, sj@...nel.org, jglisse@...hat.com,
        dennis@...nel.org, cl@...ux.com, penberg@...nel.org,
        rientjes@...gle.com, vbabka@...e.cz, ngupta@...are.org,
        linux-block@...r.kernel.org, axboe@...nel.dk,
        paolo.valente@...aro.org, josef@...icpanda.com,
        linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
        jack@...e.com, jlayton@...nel.org, dan.j.williams@...el.com,
        hch@...radead.org, djwong@...nel.org,
        dri-devel@...ts.freedesktop.org, airlied@...ux.ie,
        rodrigosiqueiramelo@...il.com, melissa.srw@...il.com,
        hamohammed.sa@...il.com
Subject: Re: Report 2 in ext4 and journal based on v5.17-rc1

On Mon, Feb 28, 2022 at 04:25:04PM -0500, Theodore Ts'o wrote:
> On Mon, Feb 28, 2022 at 11:14:44AM +0100, Jan Kara wrote:
> > > case 1. Code with an actual circular dependency, but not deadlock.
> > > 
> > >    A circular dependency can be broken by a rescue wakeup source e.g.
> > >    timeout. It's not a deadlock. If it's okay that the contexts
> > >    participating in the circular dependency and others waiting for the
> > >    events in the circle are stuck until it gets broken. Otherwise, say,
> > >    if it's not meant, then it's anyway problematic.
> > > 
> > >    1-1. What if we judge this code is problematic?
> > >    1-2. What if we judge this code is good?
> > > 
> > > I've been wondering if the kernel guys esp. Linus considers code with
> > > any circular dependency is problematic or not, even if it won't lead to
> > > a deadlock, say, case 1. Even though I designed Dept based on what I
> > > believe is right, of course, I'm willing to change the design according
> > > to the majority opinion.
> > > 
> > > However, I would never allow case 1 if I were the owner of the kernel
> > > for better stability, even though the code works anyway okay for now.
> 
> Note, I used the example of the timeout as the most obvious way of
> explaining that a deadlock is not possible.  There is also the much
> more complex explanation which Jan was trying to give, which is what
> leads to the circular dependency.  It can happen that when trying to
> start a handle, if either (a) there is not enough space in the journal
> for new handles, or (b) the current transaction is so large that if we
> don't close the transaction and start a new hone, we will end up
> running out of space in the future, and so in that case,
> start_this_handle() will block starting any more handles, and then
> wake up the commit thread.  The commit thread then waits for the
> currently running threads to complete, before it allows new handles to
> start, and then it will complete the commit.  In the case of (a) we
> then need to do a journal checkpoint, which is more work to release
> space in the journal, and only then, can we allow new handles to start.

Thank you for the full explanation of how journal things work.

> The botom line is (a) it works, (b) there aren't significant delays,
> and for DEPT to complain that this is somehow wrong and we need to
> completely rearchitect perfectly working code because it doesn't
> confirm to DEPT's idea of what is "correct" is not acceptable.

Thanks to you and Jan Kara, I realized it's not a real dependency in the
consumer and producer scenario but again *ONLY IF* there is a rescue
wakeup source. Dept should track the rescue wakeup source instead in the
case.

I won't ask you to rearchitect the working code. The code looks sane.

Thanks a lot.

Thanks,
Byungchul

> > We have a queue of work to do Q protected by lock L. Consumer process has
> > code like:
> > 
> > while (1) {
> > 	lock L
> > 	prepare_to_wait(work_queued);
> > 	if (no work) {
> > 		unlock L
> > 		sleep
> > 	} else {
> > 		unlock L
> > 		do work
> > 		wake_up(work_done)
> > 	}
> > }
> > 
> > AFAIU Dept will create dependency here that 'wakeup work_done' is after
> > 'wait for work_queued'. Producer has code like:
> > 
> > while (1) {
> > 	lock L
> > 	prepare_to_wait(work_done)
> > 	if (too much work queued) {
> > 		unlock L
> > 		sleep
> > 	} else {
> > 		queue work
> > 		unlock L
> > 		wake_up(work_queued)
> > 	}
> > }
> > 
> > And Dept will create dependency here that 'wakeup work_queued' is after
> > 'wait for work_done'. And thus we have a trivial cycle in the dependencies
> > despite the code being perfectly valid and safe.
> 
> Cheers,
> 
> 							- Ted