linux-kernel - Re: [BUG?] possible recursive locking detected


Message-Id: <1154010677.21849.66.camel@imp.csi.cam.ac.uk>
Date:	Thu, 27 Jul 2006 15:31:17 +0100
From:	Anton Altaparmakov <aia21@....ac.uk>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andrew Morton <akpm@...l.org>, nickpiggin@...oo.com.au,
	eike-kernel@...tec.de, linux-kernel@...r.kernel.org,
	aia21@...tab.net, Arjan van de Ven <arjan@...radead.org>
Subject: Re: [BUG?] possible recursive locking detected

On Thu, 2006-07-27 at 11:46 +0200, Ingo Molnar wrote:
> * Anton Altaparmakov <aia21@....ac.uk> wrote:
> 
> > An example is the potential deadlock in generic buffered file write 
> > where we fault in a page via fault_in_pages_readable() but there is 
> > nothing to guarantee that page will not go away between us doing this 
> > and us using the page.
> 
> isnt this solved by:
> 
>  commit 6527c2bdf1f833cc18e8f42bd97973d583e4aa83
>  Author: Vladimir V. Saveliev <vs@...esys.com>
>  Date:   Tue Jun 27 02:53:57 2006 -0700
> 
>      [PATCH] generic_file_buffered_write(): deadlock on vectored write
> 
> ?
> 
> if not, do you have any description of the problem or a link to previous 
> discussion[s] outlining the problem? To me it appears this is a kernel 
> bug where we simply dropped the ball to fix it. I personally dont find 
> it acceptable to have deadlocks in the kernel, where all that is needed 
> to trigger it is "high i/o loads", no matter how hard it is to fix the 
> deadlock.

For reiserfs?  Certainly not, given that it doesn't use
generic_file_buffered_write() and instead does the most useless and
stupid thing known to mankind, triggering the deadlock even more
effectively (it grabs and locks all the pages and _then_ calls
fault_in_pages_readable() afterwards!)...  In fact the way we stabilize
reiserfs is to make it use generic_file_write(), which doesn't do such
stupidities...
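
To make the ordering problem concrete, here is a userspace toy model
(not kernel code).  It assumes the source buffer of the write is backed
by the same page that is being written, e.g. via an mmap of the same
file, so that faulting it in needs the page lock.  All toy_* names and
the two helper functions are hypothetical and exist only for this
sketch; a "false" return stands in for where a real kernel would block
forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one page-cache page that is both the write destination
 * and (by assumption) the backing of the user source buffer. */
struct toy_page {
    bool locked;    /* page lock currently held by the writer */
    bool resident;  /* page contents are in memory */
};

/* Touch the source buffer.  If the page is not resident, the fault
 * handler needs the page lock to read it in.  Returns false where a
 * real kernel would sleep on a lock the caller itself holds. */
static bool toy_fault_in(struct toy_page *p)
{
    assert(p != NULL);
    if (p->resident)
        return true;
    if (p->locked)
        return false;       /* would self-deadlock */
    p->resident = true;     /* read in under a briefly held lock */
    return true;
}

/* Ordering described above for reiserfs: lock first, fault in after. */
static bool lock_then_fault(struct toy_page *p)
{
    p->locked = true;
    bool ok = toy_fault_in(p);
    p->locked = false;
    return ok;
}

/* Ordering of the generic path: fault in first, lock afterwards. */
static bool fault_then_lock(struct toy_page *p)
{
    bool ok = toy_fault_in(p);
    p->locked = true;
    /* the copy into the locked page would happen here */
    p->locked = false;
    return ok;
}
```

Starting from a page that is neither locked nor resident,
lock_then_fault() returns false (the modelled deadlock) every time,
while fault_then_lock() returns true.  The model deliberately leaves
out eviction between the fault-in and the lock.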

Note that even the above patch is not a 100% solution.  What guarantees
are there that the page faulted in will still be around when it is read
a few lines further down in the code?  Given sufficient parallel memory
pressure/io pressure the page can still be evicted again immediately
after it is faulted in...

All the above patch does is to _dramatically_ reduce the race window for
this happening; it does not eliminate it, at least in theory (AFAICS).
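
The remaining window can be sketched with another userspace toy model
(again not kernel code, and not a description of what the patch does).
toy_fault_in() stands in for fault_in_pages_readable(),
toy_memory_pressure() for an eviction landing between the fault-in and
the copy, and toy_copy_atomic() for a hypothetical copy that refuses to
fault while the page lock is held.  The retry loop is an assumption of
this sketch, added so that the effect of the window is observable as
extra attempts rather than as a hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user page that can be evicted. */
struct toy_upage {
    bool resident;
    char data[16];
    int evictions_left;  /* how many times "memory pressure" strikes */
};

/* Makes the page resident now; says nothing about later. */
static void toy_fault_in(struct toy_upage *u)
{
    u->resident = true;
}

/* Eviction hitting in the window between fault-in and copy. */
static void toy_memory_pressure(struct toy_upage *u)
{
    if (u->evictions_left > 0) {
        u->evictions_left--;
        u->resident = false;
    }
}

/* Copy that must not fault: copies nothing if the page is gone. */
static size_t toy_copy_atomic(char *dst, const struct toy_upage *u,
                              size_t len)
{
    if (!u->resident)
        return 0;
    memcpy(dst, u->data, len);
    return len;
}

/* Fault in, then copy under the (imagined) page lock; retry whenever
 * the page vanished in between.  Returns the number of attempts. */
static int toy_write(char *dst, struct toy_upage *u, size_t len)
{
    int attempts = 0;

    assert(len <= sizeof(u->data));
    for (;;) {
        attempts++;
        toy_fault_in(u);
        toy_memory_pressure(u);  /* the race window */
        /* page lock would be taken here */
        if (toy_copy_atomic(dst, u, len) == len)
            return attempts;
        /* page lock would be dropped here before retrying */
    }
}
```

With evictions_left set to 2, toy_write() needs three attempts; with 0
it succeeds on the first.  Shrinking the window corresponds to making
toy_memory_pressure() less likely to hit, not to removing it.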

So if your stance is that deadlocks are completely unacceptable, it still
is not fixed.  If your stance is that _really_ unlikely deadlocks are
acceptable, then it is fixed.

The heavily loaded servers here certainly do not suffer from the
deadlock any more (well, they haven't for a while anyway; no
guarantees it won't happen tomorrow).

Best regards,

        Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://www.linux-ntfs.org/ & http://www-stu.christs.cam.ac.uk/~aia21/
