Date:   Thu, 24 Feb 2022 11:22:39 +0100
From:   Jan Kara <jack@...e.cz>
To:     Byungchul Park <byungchul.park@....com>
Cc:     Jan Kara <jack@...e.cz>, torvalds@...ux-foundation.org,
        damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        mingo@...hat.com, linux-kernel@...r.kernel.org,
        peterz@...radead.org, will@...nel.org, tglx@...utronix.de,
        rostedt@...dmis.org, joel@...lfernandes.org, sashal@...nel.org,
        daniel.vetter@...ll.ch, chris@...is-wilson.co.uk,
        duyuyang@...il.com, johannes.berg@...el.com, tj@...nel.org,
        tytso@....edu, willy@...radead.org, david@...morbit.com,
        amir73il@...il.com, bfields@...ldses.org,
        gregkh@...uxfoundation.org, kernel-team@....com,
        linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
        minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
        sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
        penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
        ngupta@...are.org, linux-block@...r.kernel.org, axboe@...nel.dk,
        paolo.valente@...aro.org, josef@...icpanda.com,
        linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
        jack@...e.com, jlayton@...nel.org, dan.j.williams@...el.com,
        hch@...radead.org, djwong@...nel.org,
        dri-devel@...ts.freedesktop.org, airlied@...ux.ie,
        rodrigosiqueiramelo@...il.com, melissa.srw@...il.com,
        hamohammed.sa@...il.com
Subject: Re: Report 2 in ext4 and journal based on v5.17-rc1

On Thu 24-02-22 10:11:02, Byungchul Park wrote:
> On Wed, Feb 23, 2022 at 03:48:59PM +0100, Jan Kara wrote:
> > > KJOURNALD2(kthread)	TASK1(ksys_write)	TASK2(ksys_write)
> > > 
> > > wait A
> > > --- stuck
> > > 			wait B
> > > 			--- stuck
> > > 						wait C
> > > 						--- stuck
> > > 
> > > wake up B		wake up C		wake up A
> > > 
> > > where:
> > > A is a wait_queue, j_wait_commit
> > > B is a wait_queue, j_wait_transaction_locked
> > > C is a rwsem, mapping.invalidate_lock
> > 
> > I see. But a situation like this is not necessarily a guarantee of a
> > deadlock, is it? I mean there can be task D that will eventually call say
> > 'wake up B' and unblock everything and this is how things were designed to
> > work? Multiple sources of wakeups are quite common I'd say... What does
> 
> Yes. At the very beginning when I designed Dept, I was thinking about
> whether to support multiple wakeup sources or not for quite a long time.
> Supporting it would be a better option to avoid non-critical reports.
> However, I thought anyway we'd better fix it - not urgent tho - if
> there's any single circular dependency. That's why I decided not to
> support it for now and wanted to gather the kernel guys' opinions. The
> question is which policy we should go with.

I see. So supporting only a single wakeup source is fine for locks I guess.
But for general wait queues or other synchronization mechanisms, I'm afraid
it will lead to quite some false positive reports. Just my 2c.
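
To illustrate the point about multiple wakeup sources, here is a minimal
sketch in C. It is not Dept's algorithm and not kernel code. All names
(toy_task, toy_deadlocked, the bitmask encoding) are hypothetical, and it
optimistically assumes that any runnable task that may signal an event
eventually does so. Each event has a set of possible wakers, and a blocked
task counts as stuck only if every possible waker of its event is stuck too.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TASKS   8
#define TOY_NOT_WAITING (-1)

/* Hypothetical model: a task is runnable or blocked on one event. */
struct toy_task {
	int waits_on;	/* event index, or TOY_NOT_WAITING if runnable */
};

/*
 * wakers[e] is a bitmask of the tasks that may signal event e.
 * Returns a bitmask of tasks that can never be woken under this model.
 */
unsigned toy_deadlocked(const struct toy_task *tasks, int ntasks,
			const unsigned *wakers)
{
	unsigned live = 0;
	bool changed = true;
	int i;

	assert(ntasks >= 0 && ntasks <= TOY_MAX_TASKS);

	/* Runnable tasks are live from the start. */
	for (i = 0; i < ntasks; i++)
		if (tasks[i].waits_on == TOY_NOT_WAITING)
			live |= 1u << i;

	/* A blocked task becomes live once any of its wakers is live. */
	while (changed) {
		changed = false;
		for (i = 0; i < ntasks; i++) {
			if (live & (1u << i))
				continue;
			if (wakers[tasks[i].waits_on] & live) {
				live |= 1u << i;
				changed = true;
			}
		}
	}
	return ~live & ((1u << ntasks) - 1u);
}
```

With the three tasks from the report and exactly one waker per event, all
three come out as stuck. Adding a fourth runnable task as a second possible
waker of B makes the result empty, which is the "task D" case mentioned
earlier in the thread.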

> > Dept do to prevent false reports in cases like this?
> > 
> > > The above is the simplest form. And it's worth noting that Dept focuses
> > > on waits and events themselves rather than grabbing and releasing things
> > > like locks. The following is the more descriptive form of it.
> > > 
> > > KJOURNALD2(kthread)	TASK1(ksys_write)	TASK2(ksys_write)
> > > 
> > > wait @j_wait_commit
> > > 			ext4_truncate_failed_write()
> > > 			   down_write(mapping.invalidate_lock)
> > > 
> > > 			   ext4_truncate()
> > > 			      ...
> > > 			      wait @j_wait_transaction_locked
> > > 
> > > 						ext4_truncate_failed_write()
> > > 						   down_write(mapping.invalidate_lock)
> > > 
> > > 						ext4_should_retry_alloc()
> > > 						   ...
> > > 						   __jbd2_log_start_commit()
> > > 						      wake_up(j_wait_commit)
> > > jbd2_journal_commit_transaction()
> > >    wake_up(j_wait_transaction_locked)
> > > 			   up_write(mapping.invalidate_lock)
> > > 
> > > I hope this would help you understand the report.
> > 
> > I see, thanks for explanation! So the above scenario is impossible because
> 
> My pleasure.
> 
> > for anyone to block on @j_wait_transaction_locked the transaction must be
> > committing, which is done only by kjournald2 kthread and so that thread
> > cannot be waiting at @j_wait_commit. Essentially blocking on
> > @j_wait_transaction_locked means @j_wait_commit wakeup was already done.
> 
> kjournald2 repeatedly does the wait and the wake_up so the above scenario
> looks possible to me even based on what you explained. Maybe I should
> understand more about how the journal things work for further discussion.
> Your explanation is so helpful. Thank you really.

OK, let me provide you with more details for better understanding :) In
jbd2 we have an object called 'transaction'. This object can go through
many states, but what matters for our case is that a transaction is moved
into the T_LOCKED state and out of it only while the
jbd2_journal_commit_transaction() function is executing, and waiting on the
j_wait_transaction_locked waitqueue is exactly waiting for a transaction to
get out of the T_LOCKED state. The function
jbd2_journal_commit_transaction() is executed only by kjournald. Hence
anyone can see a transaction in the T_LOCKED state only if kjournald is
running inside jbd2_journal_commit_transaction(), and thus kjournald cannot
be sleeping on j_wait_commit at the same time. Does this explain things?
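
The argument above can be sketched as a toy state model in C. This is an
illustration under stated assumptions, not jbd2 code: the toy_* names, the
three-state enum, and the two boolean flags are all hypothetical stand-ins,
and the real transaction has more states. The only point is that the locked
state is entered and left solely by the committer's own commit function, so
a waiter can block only while the committer is awake inside that function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of states; the real jbd2 states differ. */
enum toy_tstate { TOY_T_RUNNING, TOY_T_LOCKED, TOY_T_PAST_LOCKED };

struct toy_journal {
	enum toy_tstate state;
	bool committer_in_commit;	/* inside the commit function */
	bool committer_sleeping;	/* asleep on the "commit" queue */
};

void toy_journal_init(struct toy_journal *j)
{
	j->state = TOY_T_RUNNING;
	j->committer_in_commit = false;
	j->committer_sleeping = true;
}

/* Analogue of waking the committer via the "commit" wait queue. */
void toy_request_commit(struct toy_journal *j)
{
	j->committer_sleeping = false;
}

/* Only the committer, once awake, moves the transaction to locked. */
void toy_commit_lock(struct toy_journal *j)
{
	assert(!j->committer_sleeping);
	j->committer_in_commit = true;
	j->state = TOY_T_LOCKED;
}

/* Leaving locked is where "transaction locked" waiters would be woken. */
void toy_commit_unlock(struct toy_journal *j)
{
	assert(j->committer_in_commit && j->state == TOY_T_LOCKED);
	j->state = TOY_T_PAST_LOCKED;
}

/* Commit done: the committer goes back to sleep. */
void toy_commit_finish(struct toy_journal *j)
{
	assert(j->committer_in_commit);
	j->committer_in_commit = false;
	j->state = TOY_T_RUNNING;	/* stands in for a new transaction */
	j->committer_sleeping = true;
}

/* A task waiting for "not locked" blocks only in the locked state. */
bool toy_waiter_would_block(const struct toy_journal *j)
{
	return j->state == TOY_T_LOCKED;
}

/* If a waiter would block, the committer is awake and mid-commit. */
bool toy_invariant_holds(const struct toy_journal *j)
{
	return !toy_waiter_would_block(j) ||
	       (j->committer_in_commit && !j->committer_sleeping);
}
```

Stepping the model through init, request, lock, unlock and finish keeps
toy_invariant_holds() true at every step, and toy_waiter_would_block() is
true only between the lock and unlock steps.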

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
