lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20190930233538.GA13922@redhat.com>
Date:   Mon, 30 Sep 2019 19:35:38 -0400
From:   Andrea Arcangeli <aarcange@...hat.com>
To:     Alex Williamson <alex.williamson@...hat.com>
Cc:     Ben Luo <luoben@...ux.alibaba.com>, cohuck@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] vfio/type1: avoid redundant PageReserved checking

Hello,

On Fri, Sep 13, 2019 at 12:05:26PM -0600, Alex Williamson wrote:
> On Mon, 2 Sep 2019 15:32:42 +0800
> Ben Luo <luoben@...ux.alibaba.com> wrote:
> 
> > 在 2019/8/30 上午1:06, Alex Williamson 写道:
> > > On Fri, 30 Aug 2019 00:58:22 +0800
> > > Ben Luo <luoben@...ux.alibaba.com> wrote:
> > >  
> > >> 在 2019/8/28 下午11:55, Alex Williamson 写道:  
> > >>> On Wed, 28 Aug 2019 12:28:04 +0800
> > >>> Ben Luo <luoben@...ux.alibaba.com> wrote:
> > >>>     
> > >>>> currently, if the page is not a tail of compound page, it will be
> > >>>> checked twice for the same thing.
> > >>>>
> > >>>> Signed-off-by: Ben Luo <luoben@...ux.alibaba.com>
> > >>>> ---
> > >>>>    drivers/vfio/vfio_iommu_type1.c | 3 +--
> > >>>>    1 file changed, 1 insertion(+), 2 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > >>>> index 054391f..d0f7346 100644
> > >>>> --- a/drivers/vfio/vfio_iommu_type1.c
> > >>>> +++ b/drivers/vfio/vfio_iommu_type1.c
> > >>>> @@ -291,11 +291,10 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> > >>>>    static bool is_invalid_reserved_pfn(unsigned long pfn)
> > >>>>    {
> > >>>>    	if (pfn_valid(pfn)) {
> > >>>> -		bool reserved;
> > >>>>    		struct page *tail = pfn_to_page(pfn);
> > >>>>    		struct page *head = compound_head(tail);
> > >>>> -		reserved = !!(PageReserved(head));
> > >>>>    		if (head != tail) {
> > >>>> +			bool reserved = PageReserved(head);
> > >>>>    			/*
> > >>>>    			 * "head" is not a dangling pointer
> > >>>>    			 * (compound_head takes care of that)  
> > >>> Thinking more about this, the code here was originally just a copy of
> > >>> kvm_is_mmio_pfn() which was simplified in v3.12 with the commit below.
> > >>> Should we instead do the same thing here?  Thanks,
> > >>>
> > >>> Alex  
> > >> ok, and kvm_is_mmio_pfn() has also been updated since then, I will take
> > >> a look at that and compose a new patch  
> > > I'm not sure if the further updates are quite as relevant for vfio, but
> > > appreciate your review of them.  Thanks,
> > >
> > > Alex  
> > 
> > After studying the related code, my personal understandings are:
> > 
> > kvm_is_mmio_pfn() is used to find out whether a memory range is MMIO 
> > mapped so that to set
> > the proper MTRR TYPE to spte.
> > 
> > is_invalid_reserved_pfn() is used in two scenarios:
> >      1. to tell whether a page should be counted against user's mlock 
> > limits, as the function's name
> > implies, all 'invalid' PFNs who are not backed by struct page and those 
> > reserved pages (including
> > zero page and those from NVDIMM DAX) should be excluded.
> > 2. to check if we have got a valid and pinned pfn for the vma 
> > with VM_PFNMAP flag.
> > 
> > So, for the zero page and 'RAM' backed PFNs without 'struct page', 
> > kvm_is_mmio_pfn() should
> > return false because they are not MMIO and are cacheable, but 
> > is_invalid_reserved_pfn() should
> > return true since they are truely reserved or invalid and should not be 
> > counted against user's
> > mlock limits.
> > 
> > For fsdax-page, current get_user_pages() returns -EOPNOTSUPP, and VFIO 
> > also returns this
> > error code to user, seems not support fsdax for now, so there is no 
> > chance to call into
> > is_invalid_reserved_pfn() currently, if fsdax is to be supported, not 
> > only this function needs to be
> > updated, vaddr_get_pfn() also needs some changes.
> > 
> > Now, with the assumption that PFNs of compound pages with reserved bit 
> > set in head will not be
> > passed to is_invalid_reserved_pfn(), we can simplify this function to:
> > 
> > static bool is_invalid_reserved_pfn(unsigned long pfn)
> > {
> >          if (pfn_valid(pfn))
> >                  return PageReserved(pfn_to_page(pfn));
> > 
> >          return true;
> > }
> > 
> > But, I am not sure if the assumption above is true, if not, we still 
> > need to check the reserved bit of
> > head for a tail page as this PATCH v2 does.
> 
> I believe what you've described is correct.  Andrea, have we missed
> anything?  Thanks,

Yes it looks good. The only reason for ever wanting to check the head
page reserved bit (instead of only checking the tail page reserved
bit) would be if any code would transfer the reserved bit from head to
tail during a hugepage split, but no hugepage split code can transfer
the reserved bit from head to tail during the split, so checking the
head can't make a difference.

The buddy wouldn't allow the driver to allocate an hugepage if any
subpage is reserved in the e820 map at boot, so non-RAM pages with a
backing struct page aren't an issue here. This was only meaningful for
PFNMAP in case the PG_reserved bit was set by the driver on a hugepage
before mapping it in userland, in which case the driver needs to set
the reserved bit in all subpages to be safe (not only in the head).

Thanks,
Andrea

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ