[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200417151707.GG20730@hirez.programming.kicks-ass.net>
Date: Fri, 17 Apr 2020 17:17:07 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Kirill A. Shutemov" <kirill@...temov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH, RFC] x86/mm/pat: Restore large pages after fragmentation
On Fri, Apr 17, 2020 at 05:42:44PM +0300, Kirill A. Shutemov wrote:
> On Fri, Apr 17, 2020 at 02:18:28PM +0200, Peter Zijlstra wrote:
> Do you mean that it is not aligned to '(' on the previous line?
>
> Okay, I'll fix it up (and change my vim setup). But I find indenting with
> spaces disgusting.
set cino=:0(0
> > > static void cpa_flush(struct cpa_data *data, int cache)
> > > {
> > > + LIST_HEAD(pgtables);
> > > + struct page *page, *tmp;
> >
> > xmas fail
>
> Emm. What are rules here?
>
> > > struct cpa_data *cpa = data;
> > > unsigned int i;
Basically we (tip) strive for the inverse xmas tree thing, so here that
would be:
struct cpa_data *cpa = data;
+ struct page *page, *tmp;
+ LIST_HEAD(pgtables);
unsigned int i;
> > > + pr_debug("2M restored at %#lx\n", addr);
> >
> > While I appreciate it's usefulness while writing this, I do think we can
> > do without that print once we know it works.
>
> Even with pr_debug()? I shouldn't be noisy for most users.
I always have debug on; also there is no counterpart on split either.
> > > +/*
> > > + * Restore PMD and PUD pages in the kernel mapping around the address where
> > > + * possible.
> > > + *
> > > + * Caller must flush TLB and free page tables queued on the list before
> > > + * touching the new entries. CPU must not see TLB entries of different size
> > > + * with different attributes.
> > > + */
> > > +static void restore_large_pages(unsigned long addr, struct list_head *pgtables)
> > > +{
> > > + pgd_t *pgd;
> > > + p4d_t *p4d;
> > > + pud_t *pud;
> > > +
> > > + addr &= PUD_MASK;
> > > +
> > > + spin_lock(&pgd_lock);
> > > + pgd = pgd_offset_k(addr);
> > > + if (pgd_none(*pgd))
> > > + goto out;
> > > + p4d = p4d_offset(pgd, addr);
> > > + if (p4d_none(*p4d))
> > > + goto out;
> > > + pud = pud_offset(p4d, addr);
> > > + if (!pud_present(*pud) || pud_large(*pud))
> > > + goto out;
> > > +
> > > + restore_pud_page(pud, addr, pgtables);
> > > +out:
> > > + spin_unlock(&pgd_lock);
> > > +}
> >
> > I find this odd, at best. AFAICT this does not attempt to reconstruct a
> > PMD around @addr when possible. When the first PMD of the PUD can't be
> > reconstructed, we give up entirely.
>
> No, you misread. If the first PMD is not suitable, we give up
> reconstructing PUD page, but we still look at all PMD of the PUD range.
Aah, indeed. I got my brain in a twist reading that pud function.
> But this might be excessive. What you are proposing below should be fine
> and more efficient. If we do everything right the rest of PMDs in the PUD
> have to already large or not suitable for reconstructing.
Just so.
> We might still look at the rest of PMDs for CONFIG_CPA_DEBUG, just to make
> sure we don't miss some corner case where we didn't restore a PMD.
>
> (Also I think about s/restore/reconstruct/g)
Right, and WARN if they do succeed collapsing ;-)
Powered by blists - more mailing lists