Message-Id: <20090126155246.2d7df309.akpm@linux-foundation.org>
Date:	Mon, 26 Jan 2009 15:52:46 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Ying Han <yinghan@...gle.com>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org, mingo@...e.hu,
	mikew@...gle.com, rientjes@...gle.com, rohitseth@...gle.com,
	hugh@...itas.com, a.p.zijlstra@...llo.nl, hpa@...or.com,
	edwintorok@...il.com, lee.schermerhorn@...com, npiggin@...e.de
Subject: Re: [RFC v2][PATCH]page_fault retry with NOPAGE_RETRY

On Mon, 26 Jan 2009 15:08:48 -0800
Ying Han <yinghan@...gle.com> wrote:

> On Mon, Jan 26, 2009 at 11:37 AM, Andrew Morton
> <akpm@...ux-foundation.org> wrote:
> 
> > On Fri, 5 Dec 2008 11:40:19 -0800
> > Ying Han <yinghan@...gle.com> wrote:
> >
> > > --- a/arch/x86/mm/fault.c
> > > +++ b/arch/x86/mm/fault.c
> > > @@ -591,6 +591,7 @@ void __kprobes do_page_fault(struct pt_regs *regs, unsigne
> > >  #ifdef CONFIG_X86_64
> > >       unsigned long flags;
> > >  #endif
> > > +     unsigned int retry_flag = FAULT_FLAG_RETRY;
> > >
> > >       tsk = current;
> > >       mm = tsk->mm;
> > > @@ -689,6 +690,7 @@ again:
> > >               down_read(&mm->mmap_sem);
> > >       }
> > >
> > > +retry:
> > >       vma = find_vma(mm, address);
> > >       if (!vma)
> > >               goto bad_area;
> > > @@ -715,6 +717,7 @@ again:
> > >  good_area:
> > >       si_code = SEGV_ACCERR;
> > >       write = 0;
> > > +     write |= retry_flag;
> > >       switch (error_code & (PF_PROT|PF_WRITE)) {
> > >       default:        /* 3: write, present */
> > >               /* fall through */
> > > @@ -743,6 +746,15 @@ good_area:
> > >                       goto do_sigbus;
> > >               BUG();
> > >       }
> > > +
> > > +     if (fault & VM_FAULT_RETRY) {
> > > +             if (write & FAULT_FLAG_RETRY) {
> > > +                     retry_flag &= ~FAULT_FLAG_RETRY;
> > > +                     goto retry;
> > > +             }
> > > +             BUG();
> > > +     }
> > > +
> > >       if (fault & VM_FAULT_MAJOR)
> > >               tsk->maj_flt++;
> > >       else
> >
> > This code is mixing flags from the FAULT_FLAG_foo domain into local
> > variable `write'.  But that's inappropriate because `write' is a
> > boolean, and in one of Ingo's trees, `write' gets bits other than bit 0
> > set, and it all generally ends up a mess.
> >
> > Can we not do that?  I assume that a previous version of this patch
> > kept those things separated?
> >
> > Something like this, I think?
> >
> > diff -puN arch/x86/mm/fault.c~page_fault-retry-with-nopage_retry-fix arch/x86/mm/fault.c
> > --- a/arch/x86/mm/fault.c~page_fault-retry-with-nopage_retry-fix
> > +++ a/arch/x86/mm/fault.c
> > @@ -799,7 +799,7 @@ void __kprobes do_page_fault(struct pt_r
> >        struct vm_area_struct *vma;
> >        int write;
> >        int fault;
> > -       unsigned int retry_flag = FAULT_FLAG_RETRY;
> > +       int retry_flag = 1;
> >
> >        tsk = current;
> >        mm = tsk->mm;
> > @@ -951,6 +951,7 @@ good_area:
> >        }
> >
> >        write |= retry_flag;
> > +
> >        /*
> >         * If for any reason at all we couldn't handle the fault,
> >         * make sure we exit gracefully rather than endlessly redo
> > @@ -969,8 +970,8 @@ good_area:
> >         * be removed or changed after the retry.
> >         */
> >         if (fault & VM_FAULT_RETRY) {
> > -               if (write & FAULT_FLAG_RETRY) {
> > -                       retry_flag &= ~FAULT_FLAG_RETRY;
> > +               if (retry_flag) {
> > +                       retry_flag = 0;
> >                        goto retry;
> >                }
> >                BUG();
> 
> With this change, 'write' still gets bits other than bit 0
> set in the case of 'write, not present', and Ingo's problem remains. Am I
> missing something here?

umm, yes.  This?

--- a/arch/x86/mm/fault.c~page_fault-retry-with-nopage_retry-fix-fix
+++ a/arch/x86/mm/fault.c
@@ -950,8 +950,6 @@ good_area:
 		return;
 	}
 
-	write |= retry_flag;
-
 	/*
 	 * If for any reason at all we couldn't handle the fault,
 	 * make sure we exit gracefully rather than endlessly redo
_


(I should just give up here - doing too many things at once)
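For clarity, here is a minimal userspace sketch of the retry-once control flow once the retry state is kept out of `write'. It is not the kernel code: fake_handle_fault(), VM_FAULT_RETRY_SKETCH, and the separate allow_retry argument are hypothetical stand-ins, and assert(0) merely stands in for BUG().

```c
#include <assert.h>

/* Hypothetical value for illustration; not taken from the patch. */
#define VM_FAULT_RETRY_SKETCH 0x01

static int calls;       /* how many times the fake handler ran */
static int last_write;  /* the `write' value the fake handler last saw */

/*
 * Hypothetical stand-in for the fault handler: it asks for a retry
 * whenever the caller says a retry is still allowed.
 */
static int fake_handle_fault(int write, int allow_retry)
{
	calls++;
	last_write = write;
	return allow_retry ? VM_FAULT_RETRY_SKETCH : 0;
}

/* `write' stays a plain boolean; retry state lives in its own variable. */
static int fault_with_one_retry(int write)
{
	int retry_flag = 1;
	int fault;

retry:
	fault = fake_handle_fault(write, retry_flag);
	if (fault & VM_FAULT_RETRY_SKETCH) {
		if (retry_flag) {
			retry_flag = 0;	/* permit exactly one retry */
			goto retry;
		}
		assert(0);	/* stands in for BUG(): retry requested twice */
	}
	return fault;
}
```

How the real handler learns that a retry is permitted is exactly the open question in this thread; the extra argument above is only one assumed way to express it.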

> > Question: why is this code passing `write==true' into handle_mm_fault()
> > in the retry case?
>
> Here I am using an unused bit of "write" to carry the FAULT_FLAG_RETRY flag
> down to handle_mm_fault(). Meanwhile, "write" still has its read/write bit
> set as it was before. It is true that 'write == true' in the retry patch, but
> I did the correct interpretation in
>
> static int do_linear_fault() {
>
>          int write = write_access & ~FAULT_FLAG_RETRY;
>          unsigned int flags = (write ? FAULT_FLAG_WRITE : 0);
>
>          flags |= (write_access & FAULT_FLAG_RETRY);
>          pte_unmap(page_table);
>          return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
> }


OK, this is horridly confusing.  Is `write_access' a boolean, as its
name implies, or is it a bunch of flags?

If we're going to turn it into a bunch of flags then it should be
renamed!  And callsites such as do_page_fault() should rename their
local variable `write' to something which accurately conveys the new
usage.  And various code comments in mm/memory.c (which don't appear to
exist) should be updated.

I think that a good way to present this is as a preparatory patch:
"convert the fourth argument to handle_mm_fault() from a boolean to a
flags word".  That would be a simple do-nothing patch which affects all
architectures and which ideally would break the build at any
unconverted code sites.  (Change the argument order?)
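As a rough userspace illustration of what "a flags word instead of a boolean" could look like, consider the sketch below. The bit values, struct fault_request, and both helper functions are hypothetical and chosen only for this example; none of them come from the patch under discussion.

```c
#include <assert.h>

/* Hypothetical bit values for illustration; the thread does not give them. */
#define FAULT_FLAG_WRITE 0x01u
#define FAULT_FLAG_RETRY 0x02u

/* Decoded view of the flags word, so nothing treats it as a boolean. */
struct fault_request {
	int write;	/* 1 if the access was a write */
	int may_retry;	/* 1 if the handler may ask for a retry */
};

/* Hypothetical helper: build the flags word at the call site. */
static unsigned int make_fault_flags(int write, int may_retry)
{
	unsigned int flags = 0;

	if (write)
		flags |= FAULT_FLAG_WRITE;
	if (may_retry)
		flags |= FAULT_FLAG_RETRY;
	return flags;
}

/* Hypothetical helper: decode the flags word inside the handler. */
static struct fault_request decode_fault_flags(unsigned int flags)
{
	struct fault_request req;

	/* Reject any bits this sketch does not define. */
	assert((flags & ~(FAULT_FLAG_WRITE | FAULT_FLAG_RETRY)) == 0);
	req.write = (flags & FAULT_FLAG_WRITE) != 0;
	req.may_retry = (flags & FAULT_FLAG_RETRY) != 0;
	return req;
}
```

With this shape, a caller that still passes a bare 0 or 1 keeps its old read/write meaning only if the write bit happens to be bit 0, which is one reason a renamed argument that breaks the build at unconverted sites may be safer.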

What do you think?