Date:   Thu, 3 Mar 2022 10:00:33 +0900
From:   Byungchul Park <byungchul.park@....com>
To:     Jan Kara <jack@...e.cz>
Cc:     torvalds@...ux-foundation.org, damien.lemoal@...nsource.wdc.com,
        linux-ide@...r.kernel.org, adilger.kernel@...ger.ca,
        linux-ext4@...r.kernel.org, mingo@...hat.com,
        linux-kernel@...r.kernel.org, peterz@...radead.org,
        will@...nel.org, tglx@...utronix.de, rostedt@...dmis.org,
        joel@...lfernandes.org, sashal@...nel.org, daniel.vetter@...ll.ch,
        chris@...is-wilson.co.uk, duyuyang@...il.com,
        johannes.berg@...el.com, tj@...nel.org, tytso@....edu,
        willy@...radead.org, david@...morbit.com, amir73il@...il.com,
        bfields@...ldses.org, gregkh@...uxfoundation.org,
        kernel-team@....com, linux-mm@...ck.org, akpm@...ux-foundation.org,
        mhocko@...nel.org, minchan@...nel.org, hannes@...xchg.org,
        vdavydov.dev@...il.com, sj@...nel.org, jglisse@...hat.com,
        dennis@...nel.org, cl@...ux.com, penberg@...nel.org,
        rientjes@...gle.com, vbabka@...e.cz, ngupta@...are.org,
        linux-block@...r.kernel.org, axboe@...nel.dk,
        paolo.valente@...aro.org, josef@...icpanda.com,
        linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
        jack@...e.com, jlayton@...nel.org, dan.j.williams@...el.com,
        hch@...radead.org, djwong@...nel.org,
        dri-devel@...ts.freedesktop.org, airlied@...ux.ie,
        rodrigosiqueiramelo@...il.com, melissa.srw@...il.com,
        hamohammed.sa@...il.com
Subject: Re: Report 2 in ext4 and journal based on v5.17-rc1

On Mon, Feb 28, 2022 at 11:14:44AM +0100, Jan Kara wrote:
> On Mon 28-02-22 18:28:26, Byungchul Park wrote:
> > case 1. Code with an actual circular dependency, but not deadlock.
> > 
> >    A circular dependency can be broken by a rescue wakeup source e.g.
> >    timeout. It's not a deadlock. If it's okay that the contexts
> >    participating in the circular dependency and others waiting for the
> >    events in the circle are stuck until it gets broken. Otherwise, say,
> >    if it's not meant, then it's anyway problematic.
> > 
> >    1-1. What if we judge this code is problematic?
> >    1-2. What if we judge this code is good?
> > 
> > case 2. Code with an actual circular dependency, and deadlock.
> > 
> >    There's no other wakeup source than those within the circular
> >    dependency. Literally deadlock. It's problematic and critical.
> > 
> >    2-1. What if we judge this code is problematic?
> >    2-2. What if we judge this code is good?
> > 
> > case 3. Code with no actual circular dependency, and not deadlock.
> > 
> >    Must be good.
> > 
> >    3-1. What if we judge this code is problematic?
> >    3-2. What if we judge this code is good?
> > 
> > ---
> > 
> > I call only 3-1 "false positive" circular dependency. And you call 1-1
> > and 3-1 "false positive" deadlock.
> > 
> > I've been wondering if the kernel guys esp. Linus considers code with
> > any circular dependency is problematic or not, even if it won't lead to
> > a deadlock, say, case 1. Even though I designed Dept based on what I
> > believe is right, of course, I'm willing to change the design according
> > to the majority opinion.
> > 
> > However, I would never allow case 1 if I were the owner of the kernel
> > for better stability, even though the code works anyway okay for now.
> 
> So yes, I call a report for the situation "There is circular dependency but
> deadlock is not possible." a false positive. And that is because in my
> opinion your definition of circular dependency includes schemes that are
> useful and used in the kernel.
> 
> Your example in case 1 is kind of borderline (I personally would consider
> that bug as well) but there are other more valid schemes with multiple
> wakeup sources like:
> 
> We have a queue of work to do Q protected by lock L. Consumer process has
> code like:
> 
> while (1) {
> 	lock L
> 	prepare_to_wait(work_queued);
> 	if (no work) {
> 		unlock L
> 		sleep
> 	} else {
> 		unlock L
> 		do work
> 		wake_up(work_done)
> 	}
> }
> 
> AFAIU Dept will create dependency here that 'wakeup work_done' is after
> 'wait for work_queued'. Producer has code like:

First of all, thank you for this good example.

> while (1) {
> 	lock L
> 	prepare_to_wait(work_done)
> 	if (too much work queued) {
> 		unlock L
> 		sleep
> 	} else {
> 		queue work
> 		unlock L
> 		wake_up(work_queued)
> 	}
> }
> 
> And Dept will create dependency here that 'wakeup work_queued' is after
> 'wait for work_done'. And thus we have a trivial cycle in the dependencies
> despite the code being perfectly valid and safe.

Unfortunately, it is neither perfectly valid nor safe without another
wakeup source, that is, a rescue wakeup source.

   consumer			producer

				lock L
				(too much work queued == true)
				unlock L
				--- preempted
   lock L
   unlock L
   do work
   lock L
   unlock L
   do work
   ...
   (no work == true)
   sleep
				--- scheduled in
				sleep

This code leads to a deadlock without another wakeup source; that is, it
is not safe.
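
As a minimal sketch, the interleaving above can be replayed step by step
in plain Python. The names replay_interleaving, asleep and wake_up are
hypothetical. The model assumes a wakeup is simply lost when its target
has not gone to sleep yet, so it deliberately ignores the
prepare_to_wait() registration in the quoted pseudocode.

```python
def replay_interleaving(queued: int, limit: int) -> dict:
    """Replay the consumer/producer interleaving shown above, in order."""
    asleep = {"consumer": False, "producer": False}

    def wake_up(ctx: str) -> None:
        # Assumption: a wakeup only affects a context that is already
        # asleep; otherwise it is lost (nothing is remembered).
        asleep[ctx] = False

    # producer: lock L, sees "too much work queued", unlock L, preempted
    producer_saw_full = queued >= limit

    # consumer: lock L / unlock L / do work, until the queue is empty
    while queued > 0:
        queued -= 1          # do work
        wake_up("producer")  # wake_up(work_done): producer not asleep yet
    asleep["consumer"] = True  # (no work == true) -> sleep

    # producer: scheduled in, acts on its stale check and sleeps
    if producer_saw_full:
        asleep["producer"] = True

    return {"queued": queued, **asleep}


state = replay_interleaving(queued=4, limit=4)
print(state)
```

Under these assumptions both contexts end up asleep with an empty queue,
and nothing inside the cycle is left to wake either of them.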

But yes, I also think this code should be allowed if it runs alongside
another wakeup source anyway. In that case, Dept should instead track
the rescue wakeup source, since that is what could lead to an actual
deadlock.

I will fix the code so that Dept tracks the rescue wakeup source
whenever it finds such a case.
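
To illustrate what a rescue wakeup source means here, the following is a
small userspace sketch using Python's threading.Condition, with a
timeout playing the role of the rescue source. The names producer_wait,
queue and LIMIT are hypothetical, and this is only an analogy, not
kernel code.

```python
import threading

lock = threading.Lock()
work_done = threading.Condition(lock)
queue, LIMIT = [], 4  # the consumer has already drained the queue


def producer_wait(timeout_s: float):
    """Wait for work_done, with the timeout as a rescue wakeup source."""
    with work_done:
        # Nobody will notify: the wake_up(work_done) was already lost.
        notified = work_done.wait(timeout=timeout_s)
        # After the timeout the producer re-checks the real condition.
        can_queue = len(queue) < LIMIT
    return notified, can_queue


notified, can_queue = producer_wait(0.05)
print(notified, can_queue)
```

Here wait() returns because of the timeout rather than a notification,
and the re-check lets the producer queue work again, so the circle is
broken from outside.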

Lastly, for your information, let me explain a little more about how
Dept works, to avoid any misunderstanding.

Assuming the consumer and producer are guaranteed not to deadlock, as in
the following, Dept won't report it as a problem:

   consumer			producer

				sleep
   wakeup work_done
				queue work
   sleep
				wakeup work_queued
   do work
				sleep
   wakeup work_done
				queue work
   sleep
				wakeup work_queued
   do work
				sleep
   ...				...

Dept does not consider all waits preceding an event but only waits that
might lead to a deadlock. In this case, Dept works with each region
independently.

   consumer			producer

				sleep <- initiates region 1
   --- region 1 starts
   ...				...
   --- region 1 ends
   wakeup work_done
   ...				...
				queue work
   ...				...
   sleep <- initiates region 2
				--- region 2 starts
   ...				...
				--- region 2 ends
				wakeup work_queued
   ...				...
   do work
   ...				...
				sleep <- initiates region 3
   --- region 3 starts
   ...				...
   --- region 3 ends
   wakeup work_done
   ...				...
				queue work
   ...				...
   sleep <- initiates region 4
				--- region 4 starts
   ...				...
				--- region 4 ends
				wakeup work_queued
   ...				...
   do work
   ...				...

That is, Dept does not build dependencies across different regions. So
you don't have to worry about unreasonable false positives that much.
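
One way to read the region diagram is as the toy model below. It is only
a sketch of the idea, not Dept's actual implementation: the trace
format, the function region_dependencies and the second trace are all
hypothetical. A region opens in the waker's context when another context
starts waiting for an event, closes when the waker fires that event, and
only waits made by the waker inside the region are recorded.

```python
from typing import List, Tuple

Op = Tuple[str, str, str]  # (context, "wait" | "wake", event name)


def region_dependencies(trace: List[Op]) -> List[Tuple[str, str]]:
    """Return (waited_event, fired_event) pairs found inside regions."""
    deps = []
    for j, (waker, op, event) in enumerate(trace):
        if op != "wake":
            continue
        # Region start: the most recent wait for this event by another
        # context that no earlier wake has already answered.
        start = None
        for i in range(j - 1, -1, -1):
            ctx, o, ev = trace[i]
            if o == "wait" and ev == event and ctx != waker:
                start = i
                break
            if o == "wake" and ev == event:
                break
        if start is None:
            continue
        # Only waits by the waker inside the region count.
        for ctx, o, ev in trace[start + 1:j]:
            if ctx == waker and o == "wait":
                deps.append((ev, event))
    return deps


# The alternating sequence from the diagram above.
trace_ok = [
    ("producer", "wait", "work_done"),
    ("consumer", "wake", "work_done"),
    ("consumer", "wait", "work_queued"),
    ("producer", "wake", "work_queued"),
    ("producer", "wait", "work_done"),
    ("consumer", "wake", "work_done"),
    ("consumer", "wait", "work_queued"),
    ("producer", "wake", "work_queued"),
]

# Hypothetical trace: the consumer waits inside the region opened by
# the producer's wait, before it fires work_done.
trace_overlap = [
    ("producer", "wait", "work_done"),
    ("consumer", "wait", "work_queued"),
    ("consumer", "wake", "work_done"),
]

print(region_dependencies(trace_ok))
print(region_dependencies(trace_overlap))
```

In this model the alternating sequence yields no dependency at all,
since every region is empty, while the overlapping trace yields a single
dependency of work_done on the wait for work_queued.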

Thoughts?

Thanks,
Byungchul

> 								Honza
> -- 
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
