lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVzD8oPv=h2q91AMdCHn3S782GmvsY-+mwoaPUw=5N7HQ@mail.gmail.com>
Date:   Mon, 07 May 2018 04:51:57 +0000
From:   Andy Lutomirski <luto@...capital.net>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        X86 ML <x86@...nel.org>, Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Proof-of-concept: better(?) page-table manipulation API

On Tue, Apr 24, 2018 at 8:44 AM Kirill A. Shutemov <
kirill.shutemov@...ux.intel.com> wrote:

> Hi everybody,

> I've proposed to talk about page able manipulation API on the LSF/MM'2018,
> so I need something material to talk about.


I gave it a quick read.  I like the concept a lot, and I have a few
comments.

> +/*
> + * How manu bottom level we account to mm->pgtables_bytes
> + */
> +#define PT_ACCOUNT_LVLS 3
> +
> +struct pt_ptr {
> +       unsigned long *ptr;
> +       int lvl;
> +};
> +

I think you've inherited something that I consider to be a defect in the
old code: you're conflating page *tables* with page table *entries*.  Your
'struct pt_ptr' sounds like a pointer to an entire page table, but AFAICT
you're using it to point to a specific entry within a table.  I think that
both the new core code and the code that uses it would be clearer and less
error prone if you made the distinction explicit.  I can think of two clean
ways to do it:

1. Add a struct pt_entry_ptr, and make it so that get_ptv(), etc take a
pt_entry_ptr instead of a pt_ptr.  Add a helper to find a pt_entry_ptr
given a pt_ptr and either an index or an address.

2. Don't allow pointers to page table entries at all.  Instead, get_ptv()
would take an address or an index parameter.

Also, what does lvl == 0 mean?  Is it the top or the bottom?  I think a
comment would be helpful.

> +/*
> + * When walking page tables, get the address of the next boundary,
> + * or the end address of the range if that comes earlier.  Although no
> + * vma end wraps to 0, rounded up __boundary may wrap to 0 throughout.
> + */

I read this comment twice, and I still don't get it.  Can you clarify what
this function does and why you would use it?

> +/* Operations on page table pointers */
> +
> +/* Initialize ptp_t with pointer to top page table level. */
> +static inline ptp_t ptp_init(struct mm_struct *mm)
> +{
> +       struct pt_ptr ptp ={
> +               .ptr = (unsigned long *)mm->pgd,
> +               .lvl = PT_TOP_LEVEL,
> +       };
> +
> +       return ptp;
> +}
> +

On some architectures, there are multiple page table roots.  For example,
ARM64 has a root for the kernel half of the address space and a root for
the user half (at least -- I don't fully understand it).  x86 PAE sort-of
has four roots.  Would it make sense to expose this in the API for real?
For example, ptp_init(mm) could be replaced with ptp_init(mm, addr).  This
would make it a bit cleaner to handle an separate user and kernel tables.
  (As it stands, what is supposed to happen on ARM if you do
ptp_init(something that isn't init_mm) and then walk it to look for a
kernel address?)

Also, ptp_init() seems oddly named for me.  ptp_get_root_for_mm(),
perhaps?  There could also be ptp_get_kernel_root() to get the root for the
init_mm's tables.

> +static inline void ptp_walk(ptp_t *ptp, unsigned long addr)
> +{
> +       ptp->ptr = (unsigned long *)ptp_page_vaddr(ptp);
> +       ptp->ptr += __pt_index(addr, --ptp->lvl);
> +}

Can you add a comment that says what this function does?  Why does it not
change the level?

> +
> +static void ptp_free(struct mm_struct *mm, ptv_t ptv)
> +{
> +       if (ptv.lvl < PT_SPLIT_LOCK_LVLS)
> +               ptlock_free(pfn_to_page(ptv_pfn(ptv)));
> +}
> +

As it stands, this is a function that seems easy easy to misuse given the
confusion between page tables and page table entries.


Finally, a general comment.  Actually fully implementing this the way
you've done it seems like a giant mess given that you need to support all
architectures.  But couldn't you implement the new API as a wrapper around
the old API so you automatically get all architectures?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ