lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Y2THgc9xgnUJg0Io@mit.edu>
Date:   Fri, 4 Nov 2022 04:04:17 -0400
From:   "Theodore Ts'o" <tytso@....edu>
To:     Byungchul Park <byungchul.park@....com>
Cc:     linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
        damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        mingo@...hat.com, peterz@...radead.org, will@...nel.org,
        tglx@...utronix.de, rostedt@...dmis.org, joel@...lfernandes.org,
        sashal@...nel.org, daniel.vetter@...ll.ch,
        chris@...is-wilson.co.uk, duyuyang@...il.com,
        johannes.berg@...el.com, tj@...nel.org, willy@...radead.org,
        david@...morbit.com, amir73il@...il.com,
        gregkh@...uxfoundation.org, kernel-team@....com,
        linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
        minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
        sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
        penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
        ngupta@...are.org, linux-block@...r.kernel.org,
        paolo.valente@...aro.org, josef@...icpanda.com,
        linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
        jack@...e.cz, jlayton@...nel.org, dan.j.williams@...el.com,
        hch@...radead.org, djwong@...nel.org,
        dri-devel@...ts.freedesktop.org, rodrigosiqueiramelo@...il.com,
        melissa.srw@...il.com, hamohammed.sa@...il.com,
        42.hyeyoo@...il.com, chris.p.wilson@...el.com,
        gwan-gyeong.mun@...el.com
Subject: Re: [QUESTION] {start,stop}_this_handle() and lockdep annotations

Note: in the future, I'd recommend looking at the MAINTAINERS to
figure out a smaller list of people to ask this question, instead of
spamming everyone who has ever expressed interest in DEPT.


On Fri, Nov 04, 2022 at 02:56:32PM +0900, Byungchul Park wrote:
> Peterz (commit 34a3d1e8370870 lockdep: annotate journal_start()) and
> the commit message quoted what Andrew Morton said. It was like:
> 
> > Except lockdep doesn't know about journal_start(), which has ranking
> > requirements similar to a semaphore.
> 
> Could anyone tell what the ranking requirements in the journal code
> exactly means and what makes {start,stop}_this_handle() work for the
> requirements?

The comment from include/linux/jbd2.h may be helpful:

#ifdef CONFIG_DEBUG_LOCK_ALLOC
	/**
	 * @j_trans_commit_map:
	 *
	 * Lockdep entity to track transaction commit dependencies. Handles
	 * hold this "lock" for read, when we wait for commit, we acquire the
	 * "lock" for writing. This matches the properties of jbd2 journalling
	 * where the running transaction has to wait for all handles to be
	 * dropped to commit that transaction and also acquiring a handle may
	 * require transaction commit to finish.
	 */
	struct lockdep_map	j_trans_commit_map;
#endif

And the reason why this isn't a problem is because start_this_handle()
can be passed a special handle which is guaranteed to not block
(because we've reserved journal credits for it).  Hence, there is no
risk that in _this_ call path start_this_handle() will block for a
commit:

> <4>[   43.124442 ] stacktrace:
> <4>[   43.124443 ]       start_this_handle+0x557/0x620
> <4>[   43.124445 ]       jbd2_journal_start_reserved+0x4d/0x1b0
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> <4>[   43.124448 ]       __ext4_journal_start_reserved+0x6d/0x190
> <4>[   43.124450 ]       ext4_convert_unwritten_io_end_vec+0x22/0xd0
> <4>[   43.124453 ]       ext4_end_io_rsv_work+0xe4/0x190
> <4>[   43.124455 ]       process_one_work+0x301/0x660
> <4>[   43.124458 ]       worker_thread+0x3a/0x3c0
> <4>[   43.124459 ]       kthread+0xef/0x120
> <4>[   43.124462 ]       ret_from_fork+0x22/0x30


The comment for this function from fs/jbd2/transaction.c:

/**
 * jbd2_journal_start_reserved() - start reserved handle
 * @handle: handle to start
 * @type: for handle statistics
 * @line_no: for handle statistics
 *
 * Start handle that has been previously reserved with jbd2_journal_reserve().
 * This attaches @handle to the running transaction (or creates one if there's
 * not transaction running). Unlike jbd2_journal_start() this function cannot
 * block on journal commit, checkpointing, or similar stuff. It can block on
 * memory allocation or frozen journal though.
 *
 * Return 0 on success, non-zero on error - handle is freed in that case.
 */

And this is why this will never be a problem in real life, or flagged
by Lockdep, since Lockdep is a dynamic checker.  The deadlock which
the static DEPT checker has imagined can never, ever, ever happen.

For more context, also from fs/jbd2/transaction.c:

/**
 * jbd2_journal_start() - Obtain a new handle.
 * @journal: Journal to start transaction on.
 * @nblocks: number of block buffer we might modify
 *
 * We make sure that the transaction can guarantee at least nblocks of
 * modified buffers in the log.  We block until the log can guarantee
 * that much space. Additionally, if rsv_blocks > 0, we also create another
 * handle with rsv_blocks reserved blocks in the journal. This handle is
 * stored in h_rsv_handle. It is not attached to any particular transaction
 * and thus doesn't block transaction commit. If the caller uses this reserved
 * handle, it has to set h_rsv_handle to NULL as otherwise jbd2_journal_stop()
 * on the parent handle will dispose the reserved one. Reserved handle has to
 * be converted to a normal handle using jbd2_journal_start_reserved() before
 * it can be used.
 *
 * Return a pointer to a newly allocated handle, or an ERR_PTR() value
 * on failure.
 */

To be clear, I don't blame DEPT for not being able to figure this out;
it would require human-level intelligence to understand why in *this*
call path, we never end up waiting.  But this is why I am very
skeptical of static analyzers which are *sure* that anything they flag
is definitely a bug.  We definitely will need a quick and easy way to
tell DEPT, "go home, you're drunk".

Hope this helps,

					- Ted


P.S.  Note: the actual function names are a bit misleading.  It looks
like the functions got refactored, and the documentation wasn't
updated to match.  Sigh... fortuantely the concepts are accurate; it's
just that function names needs to be fixed up.  For example, creating
a reserved handle is no longer done via jbd2_journal_start(), but
rather jbd2__journal_start().  

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ