linux-kernel - Re: [PATCH 0/3] ksm: write protect pages from inside ksm

Message-ID: <20090615035749.6f8236cc@woof.tlv.redhat.com>
Date:	Mon, 15 Jun 2009 03:57:49 +0300
From:	Izik Eidus <ieidus@...hat.com>
To:	Izik Eidus <ieidus@...hat.com>
Cc:	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] ksm: write protect pages from inside ksm

On Mon, 15 Jun 2009 03:05:14 +0300
Izik Eidus <ieidus@...hat.com> wrote:

> Izik Eidus wrote:
> > Izik Eidus wrote:
> >> Hugh Dickins wrote:
> >>> On Sat, 13 Jun 2009, Izik Eidus wrote:
> >>>  
> >>>> Hugh, so up to here we are in sync,
> >>>>     
> >>>
> >>> Yes, that fits with what I have here, thanks (or where it didn't
> >>> quite fit, e.g. ' versus `, I've adjusted to what you have!).  And
> >>> thanks for fixing my *orig_pte = *ptep bug, you did point that out
> >>> before, but I misunderstood at first.
> >>>
> >>>  
> >>>> The question is what you want me to do now.
> >>>> (Because we are skipping 2.6.31, it is OK for you to tell me
> >>>> something like: "Shut up and let me see what I can get with this
> >>>> madvise" - that is on the one hand.)
> >>>> On the other hand, if you want me to do anything, please say.
> >>>>     
> >>>
> >>> I had to get a bit further at my end before answering on that,
> >>> but now the answer is clear: please do some testing of your RFC
> >>> madvise() version (which is what I'm just tidying up a little),
> >>> and let me know any bugfixes you find.  Try with SLAB or SLUB or
> >>> SLQB debug on e.g. CONFIG_SLUB=y, CONFIG_SLUB_DEBUG=y and boot
> >>> option "slub_debug".
> >>>   
> >>
> >> Sure, let me check it.
> >> (You do have Andrea's patch that fixes the "used after free slab
> >> entries"?)
> >
> > How quickly does it crash with an oops for you? I compiled it and ran
> > it here on 2.6.30-rc4-mm1 with "Enable SLQB debugging support" and
> > "SLQB debugging on by default", and it runs and merges (I am using
> > qemu processes to run virtual machines and merge the pages between
> > them).
> >
> > ("SLQB debugging on by default" means I don't have to add a boot
> > parameter, right?)
> >
> > Maybe I should try updating to a newer version of the mm tree? (The
> > last commit here is Jul 22.)
> 
> OK, bug on my side: I just got that oops. I will try to fix it and
> send a patch.
> 
> (Sorry for the noise)
> 
> >
> >>
> >>> I'm finding, whether with your RFC or my tidyup, that kksmd
> >>> soon oopses in get_next_mmlist (or perhaps find_vma): presumably
> >>> accessing a vma or mm which already got freed (if you don't have
> >>> slab debugging on, it's liable to hang instead).
> >>>
> >>> (I've also not seen it actually merging yet: if you register
> >>> or madvise a large anon area and memset it, the /dev/ksm version
> >>> would merge all its pages, but I've not seen the madvise version
> >>> do so yet - though maybe there's something stupidly wrong in my
> >>> testing, really I'm more worried about the oopses at present.)
> >>>
> >>> Note that mmotm includes a patch of Nick's which adds a function
> >>> madvise_behavior_valid() - you'll need to add your MADVs into its
> >>> list to get it to work at all there.
> >>>
> >>> Here's a patch I added a month or so ago, when trying to
> >>> experiment with KSM on all mms: shouldn't be necessary if your mm
> >>> refcounting is right, but might help to avoid extra weirdness
> >>> when things go wrong: exit_mmap() leaves stale vma pointers
> >>> around, reckoning that nobody can be interested by now; but maybe
> >>> KSM might peep so better to tidy them up at least while
> >>> debugging...
> >>>
> >>> Thanks,
> >>> Hugh
> >>>
> >>> --- old/mm/mmap.c    2009-05-01 13:47:45.000000000 +0100
> >>> +++ new/mm/mmap.c    2009-05-03 11:34:47.000000000 +0100
> >>> @@ -2112,6 +2112,14 @@ void exit_mmap(struct mm_struct *mm)
> >>>      tlb_finish_mmu(tlb, 0, end);
> >>>  
> >>>      /*
> >>> +     * Make sure get_user_pages() and find_vma() etc. will find nothing:
> >>> +     * this may be necessary for KSM.
> >>> +     */
> >>> +    mm->mmap = NULL;
> >>> +    mm->mmap_cache = NULL;
> >>> +    mm->mm_rb = RB_ROOT;
> >>> +
> >>> +    /*
> >>>       * Walk the list again, actually closing and freeing it,
> >>>       * with preemption enabled, without holding any MM locks.
> >>>       */
> >>>   
> >>
> >>
> >
> >
> 
> 
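
For anyone reproducing the merge test Hugh describes above (madvise a large anonymous area, then memset it), a minimal user-space sketch follows. It assumes the advice constant is named MADV_MERGEABLE, which is the name mainline KSM ended up using; the RFC patch under discussion may use a different name. The helper ksm_test_area() is hypothetical and exists only for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and MADV_MERGEABLE */
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map an anonymous area of len bytes, advise it as mergeable, then fill
 * every page with the same byte so that all pages are identical and
 * therefore candidates for merging.
 *
 * Returns the mapping, or NULL if mmap() failed. *advise_err is set to 0
 * if the madvise() call succeeded, or to its errno otherwise (for example
 * EINVAL on a kernel without KSM support). The area is filled either way.
 */
void *ksm_test_area(size_t len, int *advise_err)
{
    void *area = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return NULL;

#ifdef MADV_MERGEABLE
    /* Assumption: the mainline constant name, not necessarily the RFC's. */
    *advise_err = madvise(area, len, MADV_MERGEABLE) == 0 ? 0 : errno;
#else
    *advise_err = ENOSYS;       /* headers too old to know the constant */
#endif

    memset(area, 0x5a, len);    /* identical content in every page */
    return area;
}
```

The caller keeps the mapping alive while the KSM thread scans, then releases it with munmap(). This only sets up the candidate pages; whether they actually get merged has to be observed separately on the kernel side.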

OK, below is an ugly fix for the oops.


From 3be1ad5a9f990113e8849fa1e74c4e74066af131 Mon Sep 17 00:00:00 2001
From: Izik Eidus <ieidus@...hat.com>
Date: Mon, 15 Jun 2009 03:52:05 +0300
Subject: [PATCH] ksm: madvise-rfc: really ugly fix for the oops bug.

This patch exists only so that the madvise RFC patch can run without
crashing.

I think the true fix for this is to add another list for ksm inside the
mm struct. In the meantime I will try to think of another way to fix this
bug.

Hugh, I hope that now you will at least be able to run it without it
crashing on you.

Signed-off-by: Izik Eidus <ieidus@...hat.com>
---
 kernel/fork.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index e5ef58c..771b89a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -484,17 +484,18 @@ void mmput(struct mm_struct *mm)
 {
 	might_sleep();
 
+	spin_lock(&mmlist_lock);
 	if (atomic_dec_and_test(&mm->mm_users)) {
+		if (!list_empty(&mm->mmlist))
+			list_del(&mm->mmlist);
+		spin_unlock(&mmlist_lock);
 		exit_aio(mm);
 		exit_mmap(mm);
 		set_mm_exe_file(mm, NULL);
-		if (!list_empty(&mm->mmlist)) {
-			spin_lock(&mmlist_lock);
-			list_del(&mm->mmlist);
-			spin_unlock(&mmlist_lock);
-		}
 		put_swap_token(mm);
 		mmdrop(mm);
+	} else {
+		spin_unlock(&mmlist_lock);
 	}
 }
 EXPORT_SYMBOL_GPL(mmput);
-- 
1.5.6.5
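
The ordering change in the patch above can be modelled in user space. This is only a sketch of the idea, not kernel code: struct mm_model, mmlist_lock_model and the mm_model_*() helpers are hypothetical names, a pthread mutex stands in for the mmlist_lock spinlock, and a flag stands in for the teardown done by exit_aio(), exit_mmap() and mmdrop(). The point it illustrates is that when the final reference drop and the list removal happen under the same lock, a walker that takes that lock should never find an entry on the list whose user count has already reached zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct: a refcount plus list links. */
struct mm_model {
    atomic_int users;
    struct mm_model *prev, *next;
    bool torn_down;             /* models exit_mmap() and friends having run */
};

static pthread_mutex_t mmlist_lock_model = PTHREAD_MUTEX_INITIALIZER;
static struct mm_model mmlist_head = {
    .prev = &mmlist_head, .next = &mmlist_head,
};

/* Put a new entry with one user at the front of the list. */
void mm_model_add(struct mm_model *mm)
{
    atomic_init(&mm->users, 1);
    mm->torn_down = false;
    pthread_mutex_lock(&mmlist_lock_model);
    mm->next = mmlist_head.next;
    mm->prev = &mmlist_head;
    mmlist_head.next->prev = mm;
    mmlist_head.next = mm;
    pthread_mutex_unlock(&mmlist_lock_model);
}

/*
 * Patched ordering: take the lock first, then drop the reference, and
 * unlink under the same lock if it was the last one. Teardown happens
 * only after the entry is no longer reachable through the list.
 * Returns true if this call released the last reference.
 */
bool mm_model_put(struct mm_model *mm)
{
    pthread_mutex_lock(&mmlist_lock_model);
    int old = atomic_fetch_sub(&mm->users, 1);
    assert(old >= 1);
    if (old == 1) {
        mm->prev->next = mm->next;
        mm->next->prev = mm->prev;
        mm->next = mm->prev = mm;
        pthread_mutex_unlock(&mmlist_lock_model);
        mm->torn_down = true;
        return true;
    }
    pthread_mutex_unlock(&mmlist_lock_model);
    return false;
}

/* Walker side: under the lock, take a reference on the first entry. */
struct mm_model *mm_model_get_first(void)
{
    struct mm_model *mm = NULL;
    pthread_mutex_lock(&mmlist_lock_model);
    if (mmlist_head.next != &mmlist_head) {
        mm = mmlist_head.next;
        atomic_fetch_add(&mm->users, 1);
    }
    pthread_mutex_unlock(&mmlist_lock_model);
    return mm;
}
```

With the unpatched ordering (drop the reference, tear down, and only then lock and unlink), a walker could take the lock in the window between the drop to zero and the unlink and pick up an entry that is already being destroyed. The cost of the patched ordering is that every put takes the global lock, which fits the "really ugly" label in the subject line.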
