Message-Id: <1233791729.4612.28.camel@pasglop>
Date: Thu, 05 Feb 2009 10:55:29 +1100
From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
To: Roland Dreier <rdreier@...co.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Eli Cohen <eli@...lanox.co.il>, linux-kernel@...r.kernel.org,
linuxppc-dev@...abs.org, Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: FW: [PATCH] powerpc/mm: Export HPAGE_SHIFT
> get_user_pages() also gives us the vma back, and we can see from
> is_vm_hugetlb_page() (-- BTW can I just say that a function
> is_xxx_page() that operates on vmas is horribly misnamed --) that these
> pages all come from a hugetlb mapping, but figuring out the size of that
> mapping is I guess a challenge.
Note that g_u_p() has all sorts of shortcomings... we were discussing
some of that recently due to bugs reported from the field.
The problem mostly is that you cannot guarantee that the physical page
will remain mapped to that virtual address in the process. For example,
if your code is part of some library used by an application, and that
application somewhere does a fork/exec (for example, a system() call to
run a shell helper), copy-on-write will hit, and you may end up with
the child process getting the original physical page and the original
process getting the copy...
So your HW will still DMA to a valid page (i.e., its count will have
been incremented) but it's not going to be the one the application
uses any more.
There are similar issues that can be caused, afaik, by madvise, etc...
We've been discussing that at KS (Kernel Summit) with various people;
Linus says g_u_p() sucks, don't do that :-) Most of the time, the other
approach should be used, i.e., the driver allocates the memory and userspace
mmaps it, in which case you get access to the VMA to set flags such as
don't copy on fork.
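As a rough userspace illustration of what "don't copy on fork" means for a
mapping (a sketch only, not the driver-side approach itself): on Linux,
madvise() with MADV_DONTFORK asks the kernel not to carry a range over into
the child at fork(), so the pages are never shared copy-on-write with it.
The function name below is made up for the example, and using mincore() in
the child to probe whether the range exists is just one convenient way to
check.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration.
 * Returns 1 if the child did NOT inherit the buffer, 0 if it did,
 * and -1 if the setup failed.
 */
int dontfork_hides_buffer_from_child(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    /* Private anonymous mapping: normally shared copy-on-write at fork(). */
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    buf[0] = 0x5a;                      /* touch it so a page is allocated */

    /* Ask the kernel not to copy this range into children. */
    if (madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: mincore() fails with ENOMEM if the range is unmapped. */
        unsigned char vec;
        int rc = mincore(buf, len, &vec);
        _exit((rc == -1 && errno == ENOMEM) ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, len);
        return -1;
    }
    int hidden = (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;

    /* The parent still sees its own data in the original mapping. */
    if (buf[0] != 0x5a)
        hidden = -1;

    munmap(buf, len);
    return hidden;
}
```

Calling dontfork_hides_buffer_from_child() from a small main() should
return 1 on a kernel that supports MADV_DONTFORK. The catch for a library
is that the child faults if it ever touches that range.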
An option possibly would be to make fork() pre-COW pages with an
elevated count to ensure that at least the original process is the one
to keep the original physical page... but that has other potential side
effects or performance issues.
A can of worms..
Cheers,
Ben.