lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGsJ_4yOrn3OP-jP+hZxrb__AarK+U7Pc15z0XmQ26RmT7h8xQ@mail.gmail.com>
Date: Thu, 18 Apr 2024 21:55:07 +1200
From: Barry Song <21cnbao@...il.com>
To: "Huang, Ying" <ying.huang@...el.com>, Khalid Aziz <khalid.aziz@...cle.com>, 
	sparclinux@...r.kernel.org
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org, 
	baolin.wang@...ux.alibaba.com, chrisl@...nel.org, david@...hat.com, 
	hanchuanhua@...o.com, hannes@...xchg.org, hughd@...gle.com, 
	kasong@...cent.com, ryan.roberts@....com, surenb@...gle.com, 
	v-songbaohua@...o.com, willy@...radead.org, xiang@...nel.org, 
	yosryahmed@...gle.com, yuzhao@...gle.com, ziy@...dia.com, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 4/5] mm: swap: entirely map large folios found in swapcache

[snip]

> > >> >
> > >> >       VM_BUG_ON(!folio_test_anon(folio) ||
> > >> >                       (pte_write(pte) && !PageAnonExclusive(page)));
> > >> > -     set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> > >> > -     arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
> > >> > +     set_ptes(vma->vm_mm, start_address, start_pte, pte, nr_pages);
> > >> > +     vmf->orig_pte = ptep_get(vmf->pte);
> > >> > +     arch_do_swap_page(vma->vm_mm, vma, start_address, pte, pte);
> > >>
> > >> Do we need to call arch_do_swap_page() for each subpage?  IIUC, the
> > >> corresponding arch_unmap_one() will be called for each subpage.
> > >
> > > i actually thought about this very carefully, right now, the only one who
> > > needs this is sparc and it doesn't support THP_SWAPOUT at all. and
> > > there is no proof doing restoration one by one won't really break sparc.
> > > so i'd like to defer this to when sparc really needs THP_SWAPOUT.
> >
> > Let's ask SPARC developer (Cced) for this.
> >
> > IMHO, even if we cannot get help, we need to change code with our
> > understanding instead of deferring it.
>
> ok. Thanks for Ccing sparc developers.

Hi Khalid & Ying (also Cced sparc maillist),

SPARC is the only platform which needs arch_do_swap_page(), right now,
its THP_SWAPOUT is not enabled. so we will not really hit a large folio
in swapcache. just in case you might need THP_SWAPOUT later, i am
changing the code as below,

@@ -4286,7 +4285,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
        VM_BUG_ON(!folio_test_anon(folio) ||
                        (pte_write(pte) && !PageAnonExclusive(page)));
        set_ptes(vma->vm_mm, start_address, start_ptep, pte, nr_pages);
-       arch_do_swap_page(vma->vm_mm, vma, start_address, pte, pte);
+       for (int i = 0; i < nr_pages; i++) {
+               arch_do_swap_page(vma->vm_mm, vma, start_address + i *
PAGE_SIZE,
+                                 pte, pte);
+               pte = pte_advance_pfn(pte, 1);
+       }

        folio_unlock(folio);
        if (folio != swapcache && swapcache) {

for sparc, nr_pages will always be 1(THP_SWAPOUT not enabled). for
arm64/x86/riscv,
it seems redundant to do a for loop "for (int i = 0; i < nr_pages; i++)".

so another option is adding a helper as below to avoid the idle loop
for arm64/x86/riscv etc.

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index e2f45e22a6d1..ea314a5f9b5e 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1085,6 +1085,28 @@ static inline void arch_do_swap_page(struct
mm_struct *mm,
 {

 }
+
+static inline void arch_do_swap_page_nr(struct mm_struct *mm,
+                                    struct vm_area_struct *vma,
+                                    unsigned long addr,
+                                    pte_t pte, pte_t oldpte,
+                                    int nr)
+{
+
+}
+#else
+static inline void arch_do_swap_page_nr(struct mm_struct *mm,
+                                    struct vm_area_struct *vma,
+                                    unsigned long addr,
+                                    pte_t pte, pte_t oldpte,
+                                    int nr)
+{
+       for (int i = 0; i < nr; i++) {
+               arch_do_swap_page(vma->vm_mm, vma, addr + i * PAGE_SIZE,
+                                pte_advance_pfn(pte, i),
+                                pte_advance_pfn(oldpte, i));
+       }
+}
 #endif

Please tell me your preference.

BTW, i found oldpte and pte are always same in do_swap_page(), is it
something wrong? does arch_do_swap_page() really need two same
arguments?


vmf->orig_pte = pte;
..
arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);


>
> >
> > > on the other hand, it seems really bad we have both
> > > arch_swap_restore  - for this, arm64 has moved to using folio
> > > and
> > > arch_do_swap_page
> > >
> > > we should somehow unify them later if sparc wants THP_SWPOUT.

Thanks
Barry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ