Message-ID: <a26a71cb-101b-e7a2-9a2f-78995538dbca@oracle.com>
Date:   Fri, 14 Sep 2018 11:04:54 -0700
From:   Prakash Sangappa <prakash.sangappa@...cle.com>
To:     Steven Sistare <steven.sistare@...cle.com>,
        Michal Hocko <mhocko@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        dave.hansen@...el.com, nao.horiguchi@...il.com,
        akpm@...ux-foundation.org, kirill.shutemov@...ux.intel.com,
        khandual@...ux.vnet.ibm.com
Subject: Re: [PATCH V2 0/6] VA to numa node information



On 9/14/18 9:01 AM, Steven Sistare wrote:
> On 9/14/2018 1:56 AM, Michal Hocko wrote:
>> On Thu 13-09-18 15:32:25, prakash.sangappa wrote:
>>>
>>> The proc interface provides an efficient way to export address range
>>> to numa node id mapping information compared to using the API.
>> Do you have any numbers?
>>
>>> For example, for sparsely populated mappings, if a VMA has large portions
>>> not have any physical pages mapped, the page walk done thru the /proc file
>>> interface can skip over non existent PMDs / ptes. Whereas using the
>>> API the application would have to scan the entire VMA in page size units.
>> What prevents you from pre-filtering by reading /proc/$pid/maps to get
>> ranges of interest?
> That works for skipping holes, but not for skipping huge pages.  I did a
> quick experiment to time move_pages on a 3 GHz Xeon and a 4.18 kernel.
> Allocate 128 GB and touch every small page.  Call move_pages with nodes=NULL
> to get the node id for all pages, passing 512 consecutive small pages per
> call to move_pages.  The total move_pages time is 1.85 secs, and 55 nsec
> per page.  Extrapolating to a 1 TB range, it would take 15 sec to retrieve
> the numa node for every small page in the range.  That is not terrible, but
> it is not interactive, and it becomes terrible for multiple TB.
>

Also, for valid VMAs in the 'maps' file, if the VMA is sparsely populated
with physical pages, the page walk can skip over non-existent page table
entries (PMDs) and so can be faster.

For example, reading the VA range of a 400GB VMA which has a few pages
mapped at the beginning, a few pages at the end, and no pages in the rest
of the VMA takes 0.001s using the /proc interface. Whereas with the
move_pages() API, passing 1024 consecutive small-page addresses per call,
it takes about 2.4 secs. This is on a similar system running a 4.19 kernel.


