linux-kernel - Re: Network receive stall avoidance (was [PATCH 2/9] deadlock prevention core)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Fri, 18 Aug 2006 21:14:09 -0700
From:	Daniel Phillips <phillips@...gle.com>
To:	Andrew Morton <akpm@...l.org>
CC:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	David Miller <davem@...emloft.net>, riel@...hat.com,
	tgraf@...g.ch, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org, Mike Christie <michaelc@...wisc.edu>
Subject: Re: Network receive stall avoidance (was [PATCH 2/9] deadlock prevention
 core)

Andrew Morton wrote:
>>handwaving
> 
> - The mmap(MAP_SHARED)-the-whole-world scenario should be fixed by
>   mm-tracking-shared-dirty-pages.patch.  Please test it and if you are
>   still able to demonstrate deadlocks, describe how, and why they
>   are occurring.

OK, but please see "atomic 0 order allocation failure" below.

> - We expect that the lots-of-dirty-anon-memory-over-swap-over-network
>   scenario might still cause deadlocks.  
> 
>   I assert that this can be solved by putting swap on local disks.  Peter
>   asserts that this isn't acceptable due to disk unreliability.  I point
>   out that local disk reliability can be increased via MD, all goes quiet.
> 
>   A good exposition which helps us to understand whether and why a
>   significant proportion of the target user base still wishes to do
>   swap-over-network would be useful.

I don't care much about swap-over-network, Peter does.  I care about
reliable remote disks in general, and reliable nbd in particular.

> - Assuming that exposition is convincing, I would ask that you determine
>   at what level of /proc/sys/vm/min_free_kbytes the swap-over-network
>   deadlock is no longer demonstrable.

Earlier, I wrote: "we need to actually fix the out of memory
deadlock/performance bug right now so that remote storage works
properly."  It is actually far more likely that memory reclaim stuffups
will cause TCP to drop block IO packets here and there than that we
will hit some endless retransmit deadlock.  But dropping packets and
thus causing TCP stalls during block writeout is a very bad thing too,
I do not think you could call a system that exhibits such behavior a
particularly reliable one.

So rather than just the word deadlock, let us add "or atomic 0 order
alloc failure during TCP receive" to the challenge.  Fair?

On the client send side, Peter already showed an easily reproducable
deadlock which we avoid by using elevator-noop.  The bug is still
there, it is just under a different rock now.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/