lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANzGp4JobgUNCUXrv-kXw_NFBCNp7pO9bGwUBJxMDjca8P7G-w@mail.gmail.com>
Date: Mon, 29 Jul 2024 21:32:20 +0800
From: Chuanhua Han <chuanhuahan@...il.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Barry Song <21cnbao@...il.com>, akpm@...ux-foundation.org, linux-mm@...ck.org, 
	ying.huang@...el.com, baolin.wang@...ux.alibaba.com, chrisl@...nel.org, 
	david@...hat.com, hannes@...xchg.org, hughd@...gle.com, 
	kaleshsingh@...gle.com, kasong@...cent.com, linux-kernel@...r.kernel.org, 
	mhocko@...e.com, minchan@...nel.org, nphamcs@...il.com, ryan.roberts@....com, 
	senozhatsky@...omium.org, shakeel.butt@...ux.dev, shy828301@...il.com, 
	surenb@...gle.com, v-songbaohua@...o.com, xiang@...nel.org, 
	yosryahmed@...gle.com, Chuanhua Han <hanchuanhua@...o.com>
Subject: Re: [PATCH v5 3/4] mm: support large folios swapin as a whole for
 zRAM-like swapfile

Matthew Wilcox <willy@...radead.org> 于2024年7月29日周一 20:55写道:
>
> On Mon, Jul 29, 2024 at 02:36:38PM +0800, Chuanhua Han wrote:
> > Matthew Wilcox <willy@...radead.org> 于2024年7月29日周一 11:51写道:
> > >
> > > On Fri, Jul 26, 2024 at 09:46:17PM +1200, Barry Song wrote:
> > > > -                     folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
> > > > -                                             vma, vmf->address, false);
> > > > +                     folio = alloc_swap_folio(vmf);
> > > >                       page = &folio->page;
> > >
> > > This is no longer correct.  You need to set 'page' to the precise page
> > > that is being faulted rather than the first page of the folio.  It was
> > > fine before because it always allocated a single-page folio, but now it
> > > must use folio_page() or folio_file_page() (whichever has the correct
> > > semantics for you).
> > >
> > > Also you need to fix your test suite to notice this bug.  I suggest
> > > doing that first so that you know whether you've got the calculation
> > > correct.
> >
> > >
> > >
> > This is no problem now, we support large folios swapin as a whole, so
> > the head page is used here instead of the page that is being faulted.
> > You can also refer to the current code context, now support large
> > folios swapin as a whole, and previously only support small page
> > swapin is not the same.
>
> You have completely failed to understand the problem.  Let's try it this
> way:
>
> We take a page fault at address 0x123456789000.
> If part of a 16KiB folio, that's page 1 of the folio at 0x123456788000.
> If you now map page 0 of the folio at 0x123456789000, you've
> given the user the wrong page!  That looks like data corruption.
The user does not get the wrong data because we are mapping the whole,
and for 16KiB folio, we map 16KiB through the page table.
>
> The code in
>         if (folio_test_large(folio) && folio_test_swapcache(folio)) {
> as Barry pointed out will save you -- but what if those conditions fail?
> What if the mmap has been mremap()ed and the folio now crosses a PMD
> boundary?  mk_pte() will now be called on the wrong page.
These special cases have been dealt with in our patch. For mthp's
large folio, mk_pte uses head page to construct pte.


-- 
Thanks,
Chuanhua

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ