Date:	Wed, 13 May 2015 16:10:54 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Christoph Lameter <cl@...ux.com>
CC:	Rik van Riel <riel@...hat.com>, Jerome Glisse <j.glisse@...il.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	jglisse@...hat.com, mgorman@...e.de, aarcange@...hat.com,
	airlied@...hat.com, aneesh.kumar@...ux.vnet.ibm.com,
	Cameron Buschardt <cabuschardt@...dia.com>,
	Mark Hairgrove <mhairgrove@...dia.com>,
	Geoffrey Gerfin <ggerfin@...dia.com>,
	John McKenna <jmckenna@...dia.com>, akpm@...ux-foundation.org
Subject: Re: Interacting with coherent memory on external devices

Sorry for reviving an oldish thread...

On 04/28/2015 01:54 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2015-04-27 at 11:48 -0500, Christoph Lameter wrote:
>> On Mon, 27 Apr 2015, Rik van Riel wrote:
>>
>>> Why would we want to avoid the sane approach that makes this thing
>>> work with the fewest required changes to core code?
>>
>> Because new ZONEs are a pretty invasive change to the memory management and
>> because there are other ways to handle references to device specific
>> memory.
>
> ZONEs is just one option we put on the table.
>
> I think we can mostly agree on the fundamentals that a good model of
> such a co-processor is a NUMA node, possibly with a higher distance
> than other nodes (but even that can be debated).
>
> That gives us a lot of the basics we need such as struct page, ability
> to use existing migration infrastructure, and is actually a reasonable
> representation at high level as well.
>
> The question is how do we additionally get the random stuff we don't
> care about out of the way. The large distance will not help that much
> under memory pressure for example.
>
> Covering the entire device memory with a CMA goes a long way toward that
> goal. It will avoid your ordinary kernel allocations.

I think ZONE_MOVABLE should be sufficient for this. CMA is basically for 
marking parts of zones as MOVABLE-only. You shouldn't need that for the 
whole zone. Although it might happen that CMA will be a special zone one 
day.

> It also provides just what we need to be able to do large contiguous
> "explicit" allocations for use by workloads that don't want the
> transparent migration and by the driver for the device which might also
> need such special allocations for its own internal management data
> structures.

Plain zone compaction + reclaim should work as well in a ZONE_MOVABLE 
zone. CMA allocations might IIRC additionally migrate across zones, e.g. 
from the device to system memory (unlike plain compaction), which might 
be what you want, or not.

> We still have the risk of pages in the CMA being pinned by something
> like gup however, that's where the ZONE idea comes in, to ensure the
> various kernel allocators will *never* allocate in that zone unless
> explicitly specified, but that could possibly be implemented differently.

Kernel allocations should ignore the ZONE_MOVABLE zone as they are not 
typically movable. Then it depends on how much control you want for 
userspace allocations.
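
To make that concrete, here is a toy userspace model in Python, not kernel code. The flag constants and the helper names (`highest_zone`, `eligible_zones`) are made up for illustration, and the zone list is simplified. It only sketches the rule that an allocation may reach ZONE_MOVABLE when it is marked both highmem-capable and movable, which ordinary kernel allocations are not.

```python
# Toy stand-ins for GFP zone modifiers; the values are arbitrary (assumption).
GFP_HIGHMEM = 0x1
GFP_MOVABLE = 0x2

# Simplified zone order, lowest to highest (assumption: no DMA zones).
ZONES = ["ZONE_NORMAL", "ZONE_HIGHMEM", "ZONE_MOVABLE"]


def highest_zone(flags):
    """Return the highest zone an allocation with these flags may use."""
    if flags & GFP_HIGHMEM and flags & GFP_MOVABLE:
        return "ZONE_MOVABLE"
    if flags & GFP_HIGHMEM:
        return "ZONE_HIGHMEM"
    return "ZONE_NORMAL"


def eligible_zones(flags):
    """Return every zone at or below the highest usable zone."""
    top = ZONES.index(highest_zone(flags))
    return ZONES[: top + 1]


# A kernel-style allocation carries neither modifier in this model.
KERNEL_LIKE = 0
# A movable userspace allocation carries both.
USER_MOVABLE_LIKE = GFP_HIGHMEM | GFP_MOVABLE
```

In this model `eligible_zones(KERNEL_LIKE)` never contains ZONE_MOVABLE, while `eligible_zones(USER_MOVABLE_LIKE)` does, so the remaining question is only how userspace allocations are steered.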

> Maybe a concept of "exclusive" NUMA node, where allocations never
> fall back to that node unless explicitly asked to go there.

I guess that could be doable on the zonelist level, where the device 
memory node/zone wouldn't be part of the "normal" zonelists, so memory 
pressure calculations should also be fine. But sure there will be some 
corner cases :)
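
A rough sketch of the zonelist idea, again as a toy Python model and not kernel code. The names `build_zonelist` and `allocate`, the distance table, and the per-node page counters are all hypothetical. It assumes the fallback list is ordered by NUMA distance and simply leaves out any node marked exclusive, so that only an explicit request can land there.

```python
def build_zonelist(local, distance, exclusive):
    """Fallback order for allocations from node `local`.

    Nodes are sorted nearest first; exclusive nodes are omitted entirely.
    """
    candidates = [n for n in distance[local] if n not in exclusive]
    return sorted(candidates, key=lambda n: distance[local][n])


def allocate(free_pages, zonelist, explicit_node=None):
    """Take one page and return the node it came from, or None on failure.

    An explicit request bypasses the zonelist and targets one node only.
    """
    order = [explicit_node] if explicit_node is not None else zonelist
    for node in order:
        if free_pages[node] > 0:
            free_pages[node] -= 1
            return node
    return None


# Hypothetical topology: nodes 0 and 1 are system memory, node 2 is the device.
distance = {
    0: {0: 10, 1: 20, 2: 80},
    1: {0: 20, 1: 10, 2: 80},
    2: {0: 80, 1: 80, 2: 10},
}
exclusive = {2}
zonelist_node0 = build_zonelist(0, distance, exclusive)
```

With system memory exhausted, a normal allocation in this model fails instead of spilling onto the device node, while `allocate(free_pages, zonelist_node0, explicit_node=2)` still succeeds. Reclaim, cpusets and mempolicies are the kind of corner cases it ignores.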

> Of course that would have an impact on memory pressure calculations,
> nothing comes completely for free, but at this stage, this is the goal
> of this thread, ie, to swap ideas around and see what's most likely to
> work in the long run before we even start implementing something.
>
> Cheers,
> Ben.