lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 6 Sep 2018 07:35:37 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Fengguang Wu <fengguang.wu@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Linux Memory Management List <linux-mm@...ck.org>,
        Peng DongX <dongx.peng@...el.com>,
        Liu Jingqi <jingqi.liu@...el.com>,
        Dong Eddie <eddie.dong@...el.com>,
        Huang Ying <ying.huang@...el.com>,
        Brendan Gregg <bgregg@...flix.com>, kvm@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH 4/5] [PATCH 4/5] kvm-ept-idle: EPT page table walk
 for A bits

On 09/01/2018 04:28 AM, Fengguang Wu wrote:
> (2) would need fundemental changes to the interface. It seems existing solutions
> for sparse files like SEEK_HOLE/SEEK_DATA and FIEMAP ioctl may not serve this
> situation well. The most efficient way could be to fill user space read()
> buffer with an array of small extents:

I've only been doing kernel development a few short years, but I've
learned that designing user/kernel interfaces is hard.

A comment in an RFC saying that we need "fundamental changes to the
interface" seems to be more of a cry for help than a request for
comment.  This basically says to me: ignore the interface, it's broken.

> This borrows host page table walk macros/functions to do EPT walk.
> So it depends on them using the same level.

Have you actually run this code?

How does this work?  It's walking the 'ept_root' that appears to be a
guest CR3 register value.  It doesn't appear to be the host CR3 value of
the qemu (or other hypervisor).

I'm also woefully confused why you are calling these EPT walks and then
walking the x86-style page tables.  EPT tables don't have the same
format as x86 page tables, plus they don't start at a CR3-provided value.

I'm also rather unsure that when running a VM, *any* host-format page
tables get updated A/D bits.  You need a host vaddr to do a host-format
page table walk in the host page tables, and the EPT tables do direct
guest physical to host physical translation.  There's no host vaddr
involved at all in those translations.

> +		if (!ept_pte_present(*pte) ||
> +		    !ept_pte_accessed(*pte))
> +			idle = 1;

Huh?  So, !Present and idle are treated the same?  If you had large
swaths of !Present memory, you would see that in this interface and say,
"gee, I've got a lot of idle memory to migrate" and then go do a bunch
of calls to migrate it?  That seems terminally wasteful.

Who is going to use this?

Do you have an example, at least dummy app?

Can more than one app read this data at the same time?  Who manages it?
Who owns it?  Are multiple reads destructive?

This entire set is almost entirely comment-free except for the
occasional C++ comments.  That doesn't inspire confidence.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ