lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202cde0e0908200003w43b91ac3v8a149ec1ace45d6d@mail.gmail.com>
Date:	Thu, 20 Aug 2009 19:03:28 +1200
From:	Alexey Korolev <akorolex@...il.com>
To:	Mel Gorman <mel@....ul.ie>
Cc:	Eric Munson <linux-mm@...bm.net>,
	Alexey Korolev <akorolev@...radead.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3]HTLB mapping for drivers (take 2)

Mel,

>> User level applications process the data.
>> Device is using a master DMA to send data to the user buffer, buffer
>> size can be >1GB and performance is very important. (So huge pages
>> mapping really makes sense.)
>>
>
> Ok, so the DMA may be faster because you have to do less scatter/gather
> and can DMA in larger chunks and and reading from userspace may be faster
> because there is less translation overhead. Right?
>
Less translation overhead is important. Unfortunately not all devices
have scatter/gather
(our case) as having it increase h/w complexity a lot.

>> In addition we have to mention that:
>> 1. It is hard for user to tell how much huge pages needs to be
>>    reserved by the driver.
>
> I think you have this problem either way. If the buffer is allocated and
> populated before mmap(), then the driver is going to have to guess how many
> pages it needs. If the DMA occurs as a result of mmap(), it's easier because
> you know the number of huge pages to be reserved at that point and you have
> the option of falling back to small pages if necessary.
>
>> 2. Devices add constrains on memory regions. For example it needs to
>>    be contiguous with in the physical address space. It is necessary to
>>   have ability to specify special gfp flags.
>
> The contiguity constraints are the same for huge pages. Do you mean there
> are zone restrictions? If so, the hugetlbfs_file_setup() function could be
> extended to specify a GFP mask that is used for the allocation of hugepages
> and associated with the hugetlbfs inode. Right now, there is a htlb_alloc_mask
> mask that is applied to some additional flags so htlb_alloc_mask would be
> the default mask unless otherwise specified.
>
Under contiguous I mean that we need several huge pages being
physically contiguous.
To obtain it we allocate pages till not find a contig. region
(success) or reach a boundary (fail).
So in our particular case approach based on getting pages from
hugetlbfs won't work
because memory region will not be contiguous.
However this approach will give an easy way to support hugetlb
mapping, it won't cause any complexity
in accounting. But it will be suitable for hardware with large amount
of sg regions only.

>
> How about;
>
>        o Extend Eric's helper slightly to take a GFP mask that is
>          associated with the inode and used for allocations from
>          outside the hugepage pool
>        o A helper that returns the page at a given offset within
>          a hugetlbfs file for population before the page has been
>          faulted.
Do you mean get_user_pages call?

Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ