Message-ID: <126046ce506df070d57e6fe5ab9c92cdaf4cf9b7.camel@intel.com>
Date:   Tue, 20 Dec 2022 08:33:05 +0000
From:   "Huang, Kai" <kai.huang@...el.com>
To:     "chao.p.peng@...ux.intel.com" <chao.p.peng@...ux.intel.com>
CC:     "tglx@...utronix.de" <tglx@...utronix.de>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "jmattson@...gle.com" <jmattson@...gle.com>,
        "Lutomirski, Andy" <luto@...nel.org>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "Hocko, Michal" <mhocko@...e.com>,
        "qemu-devel@...gnu.org" <qemu-devel@...gnu.org>,
        "tabba@...gle.com" <tabba@...gle.com>,
        "david@...hat.com" <david@...hat.com>,
        "michael.roth@....com" <michael.roth@....com>,
        "corbet@....net" <corbet@....net>,
        "bfields@...ldses.org" <bfields@...ldses.org>,
        "dhildenb@...hat.com" <dhildenb@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>, "bp@...en8.de" <bp@...en8.de>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
        "rppt@...nel.org" <rppt@...nel.org>,
        "shuah@...nel.org" <shuah@...nel.org>,
        "vkuznets@...hat.com" <vkuznets@...hat.com>,
        "vbabka@...e.cz" <vbabka@...e.cz>,
        "mail@...iej.szmigiero.name" <mail@...iej.szmigiero.name>,
        "ddutile@...hat.com" <ddutile@...hat.com>,
        "qperret@...gle.com" <qperret@...gle.com>,
        "arnd@...db.de" <arnd@...db.de>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "vannapurve@...gle.com" <vannapurve@...gle.com>,
        "naoya.horiguchi@....com" <naoya.horiguchi@....com>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "wanpengli@...cent.com" <wanpengli@...cent.com>,
        "yu.c.zhang@...ux.intel.com" <yu.c.zhang@...ux.intel.com>,
        "hughd@...gle.com" <hughd@...gle.com>,
        "aarcange@...hat.com" <aarcange@...hat.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "Nakajima, Jun" <jun.nakajima@...el.com>,
        "jlayton@...nel.org" <jlayton@...nel.org>,
        "joro@...tes.org" <joro@...tes.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Wang, Wei W" <wei.w.wang@...el.com>,
        "steven.price@....com" <steven.price@....com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linmiaohe@...wei.com" <linmiaohe@...wei.com>
Subject: Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to
 create restricted user memory

On Tue, 2022-12-20 at 15:22 +0800, Chao Peng wrote:
> On Mon, Dec 19, 2022 at 08:48:10AM +0000, Huang, Kai wrote:
> > On Mon, 2022-12-19 at 15:53 +0800, Chao Peng wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > +
> > > > > +	/*
> > > > > +	 * These pages are currently unmovable so don't place them into movable
> > > > > +	 * pageblocks (e.g. CMA and ZONE_MOVABLE).
> > > > > +	 */
> > > > > +	mapping = memfd->f_mapping;
> > > > > +	mapping_set_unevictable(mapping);
> > > > > +	mapping_set_gfp_mask(mapping,
> > > > > +			     mapping_gfp_mask(mapping) & ~__GFP_MOVABLE);
> > > > 
> > > > But, IIUC removing __GFP_MOVABLE flag here only makes page allocation from
> > > > non-movable zones, but doesn't necessarily prevent page from being migrated.
> > > > My first glance is you need to implement either a_ops->migrate_folio() or
> > > > just get_page() after faulting in the page to prevent.
> > > 
> > > The current api restrictedmem_get_page() already does this, after the
> > > caller calling it, it holds a reference to the page. The caller then
> > > decides when to call put_page() appropriately.
> > 
> > I tried to dig some history. Perhaps I am missing something, but it seems Kirill
> > said in v9 that this code doesn't prevent page migration, and we need to
> > increase page refcount in restrictedmem_get_page():
> > 
> > https://lore.kernel.org/linux-mm/20221129112139.usp6dqhbih47qpjl@box.shutemov.name/
> > 
> > But looking at this series it seems restrictedmem_get_page() in this v10 is
> > identical to the one in v9 (except v10 uses 'folio' instead of 'page')?
> 
> restrictedmem_get_page() increases page refcount several versions ago so
> no change in v10 is needed. You probably missed my reply:
> 
> https://lore.kernel.org/linux-mm/20221129135844.GA902164@chaop.bj.intel.com/

But for the non-restricted-mem case, it is correct for KVM to drop the page's
refcount after setting up the mapping in the secondary MMU; otherwise the page
would be pinned by KVM for a normal VM (since KVM uses GUP to get the page).

So what we expect is: if the page comes from restricted mem, KVM must keep the
refcount; otherwise, for a normal page obtained via GUP, KVM should drop it.
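
To make the expectation concrete, here is a rough userspace sketch of the
refcount policy described above. It is not KVM code: struct toy_page and the
toy_*() helpers are hypothetical names invented for illustration, and the
model only tracks a counter. It assumes that both GUP and
restrictedmem_get_page() return with one reference held by the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct toy_page {
	int refcount;
	bool from_restrictedmem;
};

/* Models GUP or restrictedmem_get_page(): the caller gains one reference. */
static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

/* Models put_page(): the caller gives up one reference. */
static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Hypothetical hook run after the secondary MMU mapping is set up.
 * A normal (GUP) page is released right away so it is not pinned;
 * a restricted-mem page keeps its reference to block migration.
 */
static void toy_map_done(struct toy_page *p)
{
	if (!p->from_restrictedmem)
		toy_put_page(p);
}

/*
 * Hypothetical hook run when the mapping is torn down. Only the
 * restricted-mem case still holds a reference to drop here.
 */
static void toy_unmap(struct toy_page *p)
{
	if (p->from_restrictedmem)
		toy_put_page(p);
}
```

In this sketch a normal page returns to its original refcount as soon as the
mapping is established, while a restricted-mem page stays elevated by one
until toy_unmap() runs.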

> 
> The current solution is clear: unless we have better approach, we will
> let restrictedmem user (KVM in this case) to hold the refcount to
> prevent page migration.
> 

OK.  Will leave to others :)
