linux-kernel - Re: Deadlocks with transparent huge pages and userspace fs daemons

Message-Id: <E1PSskf-00066t-US@pomaz-ex.szeredi.hu>
Date:	Wed, 15 Dec 2010 15:54:45 +0100
From:	Miklos Szeredi <miklos@...redi.hu>
To:	Andrea Arcangeli <aarcange@...hat.com>
CC:	miklos@...redi.hu, dave@...ux.vnet.ibm.com,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, shenlinf@...ibm.com,
	volobuev@...ibm.com, mel@...ux.vnet.ibm.com, dingc@...ibm.com,
	lnxninja@...ibm.com
Subject: Re: Deadlocks with transparent huge pages and userspace fs daemons

On Wed, 15 Dec 2010, Andrea Arcangeli wrote:
> Hello Miklos and everyone,
> 
> On Tue, Dec 14, 2010 at 10:03:33PM +0100, Miklos Szeredi wrote:
> > This is all fine and dandy, but please let's not forget about the
> > other thing that Dave's test uncovered.  Namely that page migration
> > triggered by transparent hugepages takes the page lock on arbitrary
> > filesystems.  This is also deadlocky on fuse, but also not a good idea
> > for any filesystem where page reading time is not bounded (think NFS
> > with network down).
> 
> In #33 I fixed the mmap_sem write issue which is more clear to me and
> it makes the code better.
> 
> The page lock I don't have full picture on it. Notably there is no
> waiting on page lock on khugepaged and khugepaged can't use page
> migration (it's not migrating, it's collapsing).
> 
> The page lock mentioned in migration context I don't see how can it be
> related to THP. There's not a _single_ lock_page in mm/huge_memory.c .
> 
> If fuse has deadlock troubles in migration lock_page then I would
> guess THP has nothing to do with it memory compaction, and it can
> trigger already in upstream stable 2.6.36 when CONFIG_COMPACTION=y by
> just doing:
> 	
> 	echo 1024 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> 
> or by simply insmodding a driver that tries a large
> alloc_pages(order).
> 
> My understanding of Dave's trace is that THP makes it easier to
> reproduce, but this isn't really THP related, it can happen already
> upstream without my patchset applied, and it's just a pure coincidence
> that THP makes it more easy to reproduce.

Right, it's questionable whether any page migration should wait for
I/O, as that can introduce large delays and even a complete lockup of
an unrelated process (as in the case of an NFS server being offline).

The man page for move_pages() clearly defines I/O as an error
condition:

  -EBUSY The page is currently busy and cannot be moved.  Try again
    later.  This occurs if a page is undergoing I/O or another
    kernel subsystem is holding a reference to the page.

yet the actual code waits for I/O, both read and write.  That might be
OK with some timeouts.  Page migration is best effort anyway, so
waiting forever on I/O makes little sense.
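
To make the documented contract concrete, here is a minimal userspace
sketch of a caller that treats -EBUSY as "try again later" and gives
up after a bounded number of attempts.  The helper names
(is_transient_status, raw_move_pages, move_page_bounded), the 10 ms
backoff and the locally defined flag constant are assumptions made
for illustration, not anything from the kernel tree or libnuma.  Note
that this only bounds the number of retries on the caller's side; it
cannot bound the time spent inside a single call if the kernel itself
waits for I/O, which is the problem described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Value of MPOL_MF_MOVE from <linux/mempolicy.h>, defined locally so the
 * sketch does not depend on libnuma's <numaif.h>. */
#define SKETCH_MPOL_MF_MOVE (1 << 1)

/* Hypothetical helper: -EBUSY is the "try again later" per-page status
 * described in the move_pages() man page. */
static int is_transient_status(int status)
{
    return status == -EBUSY;
}

/* Thin wrapper around the raw move_pages(2) system call. */
static long raw_move_pages(int pid, unsigned long count, void **pages,
                           const int *nodes, int *status, int flags)
{
#ifdef SYS_move_pages
    return syscall(SYS_move_pages, pid, count, pages, nodes, status, flags);
#else
    errno = ENOSYS;   /* not a Linux build: report "not implemented" */
    return -1;
#endif
}

/* Hypothetical helper: ask the kernel to move one page of the calling
 * process to `node`, retrying at most `max_tries` times while the
 * per-page status is -EBUSY.  Returns the number of attempts made, or
 * -1 with errno set if the arguments are invalid or the call itself
 * fails.  On a non-negative return, *status holds the last per-page
 * status reported by the kernel. */
static int move_page_bounded(void *addr, int node, int max_tries, int *status)
{
    struct timespec backoff = { 0, 10 * 1000 * 1000 }; /* 10 ms between tries */
    int attempt;

    assert(status != NULL);
    if (addr == NULL || max_tries <= 0) {
        errno = EINVAL;
        return -1;
    }
    for (attempt = 1; attempt <= max_tries; attempt++) {
        void *pages[1] = { addr };
        int nodes[1] = { node };

        if (raw_move_pages(0, 1, pages, nodes, status, SKETCH_MPOL_MF_MOVE) < 0)
            return -1;                 /* the call itself failed */
        if (!is_transient_status(*status))
            return attempt;            /* moved, or a non-retryable status */
        if (attempt < max_tries)
            nanosleep(&backoff, NULL); /* page busy: back off and retry */
    }
    return max_tries;                  /* gave up; *status is still -EBUSY */
}
```

A caller would pass the address of a page it has already touched, a
target node and a small retry budget, then inspect *status: a
non-negative value is a node number, a negative one is a negated
errno such as -EBUSY.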

>  How to fix I'm not sure yet
> as I didn't look into it closely as I was focusing on rolling a THP
> specific update first, but at the moment it even sounds more like an
> issue with strict migration than memory compaction.

Yes, this is a page migration issue.  But the fact is, THP will make
it more visible exactly because it can be used without any special
configuration.

Thanks,
Miklos