lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 27 Jul 2006 15:31:17 +0100
From:	Anton Altaparmakov <aia21@....ac.uk>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andrew Morton <akpm@...l.org>, nickpiggin@...oo.com.au,
	eike-kernel@...tec.de, linux-kernel@...r.kernel.org,
	aia21@...tab.net, Arjan van de Ven <arjan@...radead.org>
Subject: Re: [BUG?] possible recursive locking detected

On Thu, 2006-07-27 at 11:46 +0200, Ingo Molnar wrote:
> * Anton Altaparmakov <aia21@....ac.uk> wrote:
> 
> > An example is the potential deadlock in generic buffered file write 
> > where we fault in a page via fault_in_pages_readable() but there is 
> > nothing to guarantee that page will not go away between us doing this 
> > and us using the page.
> 
> isnt this solved by:
> 
>  commit 6527c2bdf1f833cc18e8f42bd97973d583e4aa83
>  Author: Vladimir V. Saveliev <vs@...esys.com>
>  Date:   Tue Jun 27 02:53:57 2006 -0700
> 
>      [PATCH] generic_file_buffered_write(): deadlock on vectored write
> 
> ?
> 
> if not, do you have any description of the problem or a link to previous 
> discussion[s] outlining the problem? To me it appears this is a kernel 
> bug where we simply dropped the ball to fix it. I personally dont find 
> it acceptable to have deadlocks in the kernel, where all that is needed 
> to trigger it is "high i/o loads", no matter how hard it is to fix the 
> deadlock.

For reiserfs?  Certainly not given it doesn't use
generic_file_buffered_write() and instead does the most useless and
stupid thing known to mankind by causing the deadlock even more
effectively (it grabs and locks all the pages and _then_ calls
fault_in_pages_readable() afterwards!)...  In fact the way we stabilize
reiserfs is to make it use generic_file_write() which doesn't do such
stupidities...

Note that even the above patch is not a 100% solution.  What guarantees
are there that the page faulted in will still be around when it is read
a few lines down the line in the code?  Given sufficient parallel memory
pressure/io pressure it can still cause the page to be evicted again
immediately after it is faulted in...

All the above patch does is to _dramatically_ reduce the race window for
this happening but it does not eliminate it in theory (AFAICS).

So if your stance is that deadlocks are completely unacceptable it still
is not fixed.  If your stance is that _really_ unlikely deadlocks are
acceptable then it is fixed.

The heavily loaded servers here certainly do not suffer from the
deadlock any more (well they haven't done for a while anyway no
guarantees it won't happen tomorrow).

Best regards,

        Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://www.linux-ntfs.org/ & http://www-stu.christs.cam.ac.uk/~aia21/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ