Message-ID: <Zh4cqZkuPR9V1t1o@li-008a6a4c-3549-11b2-a85c-c5cc2836eea2.ibm.com>
Date: Tue, 16 Apr 2024 08:37:29 +0200
From: Alexander Gordeev <agordeev@...ux.ibm.com>
To: David Hildenbrand <david@...hat.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Xu <peterx@...hat.com>, Sven Schnelle <svens@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Andrea Arcangeli <aarcange@...hat.com>, kvm@...r.kernel.org,
linux-s390@...r.kernel.org
Subject: Re: [PATCH v3 2/2] s390/mm: re-enable the shared zeropage for !PV and !skeys KVM guests

On Mon, Apr 15, 2024 at 09:14:03PM +0200, David Hildenbrand wrote:
> > > +retry:
> > > +		rc = walk_page_range_vma(vma, addr, vma->vm_end,
> > > +					 &find_zeropage_ops, &addr);
> > > +		if (rc <= 0)
> > > +			continue;
> >
> > So in case an error is returned for the last vma, __s390_unshare_zeropages()
> > finishes with that error. By contrast, an error for a non-last vma would
> > be ignored?
>
> Right, it looks a bit off. walk_page_range_vma() shouldn't fail
> unless find_zeropage_pte_entry() would fail -- which would also be
> very unexpected.
>
> To handle it cleanly in case we would ever get a weird zeropage where we
> don't expect it, we should probably just exit early.
>
> Something like the following (not compiled, addressing the comment below):
> @@ -2618,7 +2618,8 @@ static int __s390_unshare_zeropages(struct mm_struct *mm)
>  	struct vm_area_struct *vma;
>  	VMA_ITERATOR(vmi, mm, 0);
>  	unsigned long addr;
> -	int rc;
> +	vm_fault_t rc;
> +	int zero_page;
I would name the vm_fault_t variable "fault" (as is usual where
handle_mm_fault() is called) and leave rc as is:

	vm_fault_t fault;
	int rc;

>  	for_each_vma(vmi, vma) {
>  		/*
> @@ -2631,9 +2632,11 @@ static int __s390_unshare_zeropages(struct mm_struct *mm)
>  		addr = vma->vm_start;
>  retry:
> -		rc = walk_page_range_vma(vma, addr, vma->vm_end,
> -					 &find_zeropage_ops, &addr);
> -		if (rc <= 0)
> +		zero_page = walk_page_range_vma(vma, addr, vma->vm_end,
> +						&find_zeropage_ops, &addr);
> +		if (zero_page < 0)
> +			return zero_page;
> +		else if (!zero_page)
>  			continue;
>  		/* addr was updated by find_zeropage_pte_entry() */
> @@ -2656,7 +2659,7 @@ static int __s390_unshare_zeropages(struct mm_struct *mm)
>  		goto retry;
>  	}
> -	return rc;
> +	return 0;
>  }
>  static int __s390_disable_cow_sharing(struct mm_struct *mm)
..
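To make the control flow easier to compare, here is a rough userspace model
of the loop with the two variables kept apart ("fault" for the fault result,
"rc" for the walk result). It is only a sketch: struct mock_vma, mock_walk()
and mock_handle_fault() are hypothetical stand-ins, the VM_FAULT_OOM value is
illustrative, and the mock forgets its zeropage after one walk, whereas the
real code would find it again on retry.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins; the bit value is illustrative, not the kernel's. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_OOM 0x1u

/* Hypothetical per-VMA state driving the mocks below. */
struct mock_vma {
	int walk_result;	/* <0: error, 0: no zeropage, >0: zeropage found */
	vm_fault_t fault;	/* what the mock fault handler reports */
};

/* Stand-in for the page walk: reports a found zeropage only once. */
static int mock_walk(struct mock_vma *vma)
{
	int rc = vma->walk_result;

	if (rc > 0)
		vma->walk_result = 0;
	return rc;
}

/* Stand-in for the fault handler called to unshare the zeropage. */
static vm_fault_t mock_handle_fault(const struct mock_vma *vma)
{
	return vma->fault;
}

/* Model of the loop: any walk error exits early, whichever VMA it hits. */
static int unshare_zeropages_model(struct mock_vma *vmas, size_t n)
{
	vm_fault_t fault;
	size_t i;
	int rc;

	for (i = 0; i < n; i++) {
retry:
		rc = mock_walk(&vmas[i]);
		if (rc < 0)
			return rc;	/* walk failed: stop here */
		if (!rc)
			continue;	/* no zeropage left in this VMA */

		fault = mock_handle_fault(&vmas[i]);
		if (fault & VM_FAULT_OOM)
			return -ENOMEM;
		goto retry;		/* look the same VMA up again */
	}
	return 0;
}
```

With this shape an error from the first of two VMAs is returned to the caller
instead of being overwritten by the result for the last one.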
> > > +		/* addr was updated by find_zeropage_pte_entry() */
> > > +		rc = handle_mm_fault(vma, addr,
> > > +				     FAULT_FLAG_UNSHARE | FAULT_FLAG_REMOTE,
> > > +				     NULL);
> > > +		if (rc & VM_FAULT_OOM)
> > > +			return -ENOMEM;
> >
> > Heiko pointed out that rc type is inconsistent vs vm_fault_t returned by
>
> Right, let's use another variable for that.
>
> > handle_mm_fault(). While fixing it up, I became concerned about whether it
> > is fine to continue when any other error is hit (including possible future
> > VM_FAULT_xxxx)?
>
> Such future changes would similarly break break_ksm(). Staring at it, I do wonder
> if break_ksm() should be handling VM_FAULT_HWPOISON ... very likely we should
> handle it and fail -- we might get an MC while copying from the source page.
>
> VM_FAULT_HWPOISON on the shared zeropage would imply a lot of trouble, so
> I'm not concerned about that for the case here, but handling it in the future
> would be cleaner.
>
> Note that we always retry the lookup, so we won't just skip a zeropage on unexpected
> errors.
>
> We could piggy-back on vm_fault_to_errno(). We could use
> vm_fault_to_errno(rc, FOLL_HWPOISON), and only continue (retry) if the rc is 0 or
> -EFAULT, otherwise fail with the returned error.
>
> But I'd do that as a follow up, and also use it in break_ksm() in the same fashion.
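For the record, the policy described above could look roughly like the
userspace sketch below. It is not compiled against a kernel tree: the flag
values are illustrative, vm_fault_to_errno_model() only mirrors my reading of
what vm_fault_to_errno() does, and unshare_fault_policy() is a hypothetical
helper name.

```c
#include <errno.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* generic Linux value; fallback for other libcs */
#endif

/* Userspace stand-ins; the bit values are illustrative, not the kernel's. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_OOM		0x01u
#define VM_FAULT_SIGBUS		0x02u
#define VM_FAULT_HWPOISON	0x04u
#define VM_FAULT_HWPOISON_LARGE	0x08u
#define VM_FAULT_SIGSEGV	0x10u
#define FOLL_HWPOISON		0x01

/* Assumed behaviour of vm_fault_to_errno(); check it against the tree. */
static int vm_fault_to_errno_model(vm_fault_t fault, int foll_flags)
{
	if (fault & VM_FAULT_OOM)
		return -ENOMEM;
	if (fault & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE))
		return (foll_flags & FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
	if (fault & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV))
		return -EFAULT;
	return 0;
}

/*
 * Hypothetical helper: returns 0 if the lookup should be retried,
 * or a negative errno if the caller should fail.
 */
static int unshare_fault_policy(vm_fault_t fault)
{
	int rc = vm_fault_to_errno_model(fault, FOLL_HWPOISON);

	if (rc && rc != -EFAULT)
		return rc;	/* e.g. -ENOMEM or -EHWPOISON: give up */
	return 0;		/* success or -EFAULT: retry the lookup */
}
```

The same helper shape would presumably fit break_ksm() as well, should the
follow-up go that way.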
@Christian, do you agree with this suggestion?
Thanks!