linux-kernel - Re: [RFC PATCH] Add /proc/<pid>/numa


Message-ID: <51145ccb-fc0d-0281-9757-fb8a5112ec24@oracle.com>
Date:   Mon, 7 May 2018 16:22:15 -0700
From:   "prakash.sangappa" <prakash.sangappa@...cle.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-api@...r.kernel.org, mhocko@...e.com,
        kirill.shutemov@...ux.intel.com, n-horiguchi@...jp.nec.com,
        drepper@...il.com, rientjes@...gle.com,
        Naoya Horiguchi <nao.horiguchi@...il.com>
Subject: Re: [RFC PATCH] Add /proc/<pid>/numa_vamaps for numa node information

On 05/03/2018 03:26 PM, Dave Hansen wrote:
> On 05/03/2018 03:27 PM, prakash.sangappa wrote:
>> If each consecutive page comes from different node, yes in
>> the extreme case is this file will have a lot of lines. All the lines
>> are generated at the time file is read. The amount of data read will be
>> limited to the user read buffer size used in the read.
>>
>> /proc/<pid>/pagemap also has kind of  similar issue. There is 1 64
>> bit value for each user page.
> But nobody reads it sequentially.  Everybody lseek()s because it has a
> fixed block size.  You can't do that in text.

The current text-based files in /proc do allow seeking, but that does not
help to seek to a specific VA (vma) to start from, because the seek offset
is an offset into the generated text. This is the case when the 'seq_file'
interface is used in the kernel to generate the /proc file content.
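
For contrast, here is a rough user-space sketch of why seeking works for
/proc/<pid>/pagemap: with one 64-bit value per page, the file offset can be
computed directly from the VA. The helper names (pagemap_offset,
read_pagemap_entry) are made up for illustration, and the entry is returned
raw, without decoding its bits.

```python
import os
import struct

PAGEMAP_ENTRY_SIZE = 8  # pagemap holds one 64-bit value per user page


def pagemap_offset(vaddr, page_size=None):
    """Byte offset in /proc/<pid>/pagemap of the entry covering vaddr."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return (vaddr // page_size) * PAGEMAP_ENTRY_SIZE


def read_pagemap_entry(pid, vaddr):
    """Return the raw 64-bit pagemap entry for vaddr, or None if unreadable."""
    try:
        fd = os.open(f"/proc/{pid}/pagemap", os.O_RDONLY)
    except OSError:
        return None
    try:
        # pread() seeks and reads in one call; no sequential scan is needed.
        data = os.pread(fd, PAGEMAP_ENTRY_SIZE, pagemap_offset(vaddr))
    except OSError:
        return None
    finally:
        os.close(fd)
    if len(data) != PAGEMAP_ENTRY_SIZE:
        return None
    return struct.unpack("=Q", data)[0]  # native byte order, unsigned 64-bit
```

No such calculation is possible for a text file whose line lengths vary.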

However, with the proposed new file, we could allow seeking to a specified
virtual address. The lseek offset in this case would represent a virtual
address of the process. A subsequent read from the file would provide VA
range to numa node information starting from that VA. If the VA seeked to is
invalid, the read would start from the next valid mapped VA of the process.
The implementation would not be based on seq_file.
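
From user space, the proposed usage might look roughly like the sketch
below. This is only an illustration of the idea: the file is an RFC proposal,
so the file name, the seek semantics, and the helper name
read_numa_ranges_from are all assumptions, and the function simply returns
None on kernels that do not provide the file.

```python
import os


def read_numa_ranges_from(pid, start_va, bufsize=4096, filename="numa_vamaps"):
    """Seek the (proposed) per-process file to start_va and read one buffer.

    Returns the text read, or None if the file is missing or the seek/read
    fails. The amount of data returned is bounded by bufsize.
    """
    path = f"/proc/{pid}/{filename}"
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None  # interface not available on this kernel
    try:
        # Assumption from the proposal: the offset is the virtual address
        # itself, not a byte position in the generated text.
        os.lseek(fd, start_va, os.SEEK_SET)
        return os.read(fd, bufsize).decode("ascii", "replace")
    except OSError:
        return None
    finally:
        os.close(fd)
```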

For example, consider getting numa node information, starting from
'006dc000', for a process that has the following VMAs mapped:

00400000-004dd000
006dc000-006dd000
006dd000-006e6000

One can seek to VA 006dc000 and start reading, which would return the following:

006dc000-006dd000 N1=1 kernelpagesize_kB=4 anon=1 dirty=1 file=/usr/bin/bash
006dd000-006de000 N0=1 kernelpagesize_kB=4 anon=1 dirty=1 file=/usr/bin/bash
006de000-006e0000 N1=2 kernelpagesize_kB=4 anon=2 dirty=2 file=/usr/bin/bash
006e0000-006e6000 N0=6 kernelpagesize_kB=4 anon=6 dirty=6 file=/usr/bin/bash
..
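
A reader of such output could split each line into a VA range, per-node page
counts, and the remaining key=value attributes. The parser below is a minimal
sketch that assumes exactly the line layout shown in the example above; the
function name and the dictionary layout are made up, and file names
containing spaces are not handled.

```python
def parse_numa_vamaps_line(line):
    """Parse one 'start-end N<node>=<pages> key=value ...' line."""
    fields = line.split()
    start_s, end_s = fields[0].split("-")
    entry = {
        "start": int(start_s, 16),  # addresses are printed in hex
        "end": int(end_s, 16),
        "nodes": {},                # node id -> number of pages on that node
        "attrs": {},                # everything else, e.g. anon, dirty, file
    }
    for field in fields[1:]:
        key, _, value = field.partition("=")
        if key.startswith("N") and key[1:].isdigit():
            entry["nodes"][int(key[1:])] = int(value)
        elif value.isdigit():
            entry["attrs"][key] = int(value)
        else:
            entry["attrs"][key] = value
    return entry
```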

One advantage of getting numa node information from this /proc file, versus
say using the 'move_pages()' API, is that the /proc file can provide address
range to numa node information, rather than one page at a time.
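
To illustrate the difference: move_pages() called with a NULL nodes argument
reports one status value per page, so a caller who wants ranges has to merge
consecutive pages itself. The sketch below shows only that merging step on an
already collected list of per-page node ids; the function name and the tuple
layout are assumptions for illustration, and a 4 kB page size is assumed by
default.

```python
def coalesce_page_nodes(start_va, page_nodes, page_size=4096):
    """Merge consecutive pages on the same node into (start, end, node) ranges.

    page_nodes[i] is the node reported for the page at
    start_va + i * page_size.
    """
    ranges = []
    for i, node in enumerate(page_nodes):
        va = start_va + i * page_size
        if ranges and ranges[-1][2] == node and ranges[-1][1] == va:
            # Same node as the previous page and contiguous: extend the range.
            ranges[-1] = (ranges[-1][0], va + page_size, node)
        else:
            ranges.append((va, va + page_size, node))
    return ranges
```

With per-page results of 1, 0, 1, 1, 0, 0, 0, 0, 0, 0 starting at 006dc000,
this yields the same four ranges as the example output above.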