Message-ID: <20100305154552.GA6000@thunk.org>
Date:	Fri, 5 Mar 2010 10:45:52 -0500
From:	tytso@....edu
To:	Enrik Berkhan <Enrik.Berkhan@...com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: possible ext4 related deadlock

On Fri, Mar 05, 2010 at 02:56:28PM +0100, Enrik Berkhan wrote:
> 
> Meanwhile, I have found out that thread 2 actually isn't completely
> blocked but loops in __alloc_pages_internal:
> 
> get_page_from_freelist() doesn't return a page;
> try_to_free_pages() returns did_some_progress == 0;
> later, do_retry == 1 and the loop restarts with goto rebalance;
> 
> Can anybody explain this behaviour and maybe direct me to the root cause?
> 
> Of course, this now looks more like a page allocation problem than
> an ext4 one.
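
(For reference, the retry path described above corresponds roughly to the
following control flow in the 2.6-era __alloc_pages_internal(); this is a
condensed paraphrase with argument lists and locking details trimmed, not
the literal kernel source:)

rebalance:
	did_some_progress = try_to_free_pages(zonelist, order, gfp_mask);

	if (likely(did_some_progress)) {
		page = get_page_from_freelist(gfp_mask, nodemask, order,
					      zonelist, high_zoneidx,
					      alloc_flags);
		if (page)
			goto got_pg;
	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
		/* possibly invoke the OOM killer and restart */
	}

	do_retry = 0;
	if (!(gfp_mask & __GFP_NORETRY)) {
		if (order <= PAGE_ALLOC_COSTLY_ORDER ||
		    (gfp_mask & __GFP_REPEAT))
			do_retry = 1;
		if (gfp_mask & __GFP_NOFAIL)
			do_retry = 1;
	}
	if (do_retry) {
		congestion_wait(WRITE, HZ/50);
		goto rebalance;
	}
	/* the failure path is reached only when do_retry == 0; with
	 * did_some_progress == 0 but do_retry == 1 this simply spins */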

Yep, I'd have to agree with you.  We're only trying to allocate a
single page here, and you have plenty of pages available.  Just
checking...  you don't have CONFIG_NUMA enabled and aren't doing
something crazy with NUMA nodes, are you?

The reason why there is no retry logic in this piece of code is that
this is not something we've *ever* anticipated would fail --- and if
it fails, the system is so badly in the weeds that we're probably
better off just returning ENOMEM and passing the error back to
userspace.  If you can't even allocate a single page, the OOM killer
should have been killing off random processes for a while anyway, so
an ENOMEM return from a write(2) system call means the process has
gotten off lightly; it's lucky to still be alive, after all, with the
OOM killer surely going postal if things are so bad that a 4k page
allocation isn't succeeding.  :-)
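
(As an illustration of the pattern being described, a single-page
allocation with no retry whose failure simply surfaces as -ENOMEM to the
write(2) caller; this is a generic sketch, not the actual ext4 call site:)

	struct page *page;

	page = alloc_page(GFP_NOFS);
	if (!page)
		return -ENOMEM;	/* propagates back up to write(2) */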

So I would definitely say this is a problem with the page allocation
mechanism; you say it's been modified a bit for NOMMU on the Blackfin
architecture?  I imagine the fact that you don't have any kind of VM
means that the allocator must be playing all sorts of tricks to make
sure that a process can allocate memory contiguously in its data
segment, for example.  I assume that there are regions of memory which
are reserved for page cache?  I'm guessing you have to change how much
room is reserved for the page cache, or some such.  This sounds like
it's a Blackfin-specific issue.

My recommendation would be that you don't put the retry loop in the
ext4 code, since I suspect this is not going to be the only place
where the Blackfin is going to randomly fail to give you a 4k page,
even though there's plenty of memory somewhere else.  The lack of an
MMU pretty much guarantees this sort of thing can happen.  So the
better approach seems to be to put the retry loop in the page
allocator, with a WARN_ON(1) when it fires so the developers can tune
the manual memory partition settings appropriately.
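
(A rough sketch of that suggestion, living in the platform's allocator
rather than in ext4; the helper name and the exact backoff are made up
for illustration:)

	/* Hypothetical NOMMU-side helper: warn and retry small-order
	 * allocations instead of letting them fail outright. */
	static struct page *blackfin_alloc_page_retry(gfp_t gfp_mask)
	{
		struct page *page;

		while (!(page = alloc_page(gfp_mask))) {
			/* flag it so the memory partition settings
			 * can be retuned */
			WARN_ON(1);

			if (!(gfp_mask & __GFP_WAIT))
				break;	/* atomic callers still see NULL */

			/* back off briefly, then retry */
			schedule_timeout_uninterruptible(HZ / 50);
		}

		return page;
	}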

Regards,

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
