lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 14 Apr 2010 14:15:16 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Chris Mason <chris.mason@...cle.com>
Cc:	Mel Gorman <mel@....ul.ie>, Dave Chinner <david@...morbit.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] mm: disallow direct reclaim page writeback

Chris Mason <chris.mason@...cle.com> writes:
>> 
>> Basically if you cannot tolerate 1K (or more likely more) of stack
>> used before your fs is called you're toast in lots of other situations
>> anyways.
>
> Well, on a 4K stack kernel, 832 bytes is a very large percentage for
> just one function.

To be honest I think 4K stack simply has to go. I tend to call
it "russian roulette" mode. 

It was just a old workaround for a very old buggy VM that couldn't free 8K
pages and the VM is a lot better at that now. And the general trend is
to more complex code everywhere, so 4K stacks become more and more hazardous.

It was a bad idea back then and is still a bad idea, getting
worse and worse with each MLOC being added to the kernel each year.

We don't have any good ways to verify that obscure paths through
the more and more subsystems won't exceed it (in fact I'm pretty
sure there are plenty of problems in exotic configurations)

And even if you can make a specific load work there's basically
no safety net.

The only part of the 4K stack code that's good is the separate
interrupt stack, but that one should be just combined with a sane 8K 
process stack.

But yes on a 4K kernel you probably don't want to do any direct reclaim. 
Maybe for GFP_NOFS everywhere except user allocations when it's set? 
Or simply drop it?

> But they don't realize their function can dive down into ecryptfs then
> the filesystem then maybe loop and then perhaps raid6 on top of a
> network block device.

Those stackings need to use separate threads anyways. A lot of them
do in fact. Block avoided this problem by iterating instead of
recursing.  Those that still recurse on the same stack simply
need to be fixed.

> Yeah, but since the call chain does eventually go into the allocator,
> this function needs to be more stack friendly.

For common fast paths it doesn't go into the allocator.

-Andi

-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ