[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20100113045538.GA3901@nick>
Date:	Wed, 13 Jan 2010 15:55:38 +1100
From:	Nick Piggin <npiggin@...e.de>
To:	Yongseok Koh <yongseok.koh@...sung.com>
Cc:	'Andrew Morton' <akpm@...ux-foundation.org>, gregkh@...e.de,
	vegard.nossum@...il.com, mingo@...e.hu, penberg@...helsinki.fi,
	paulmck@...ux.vnet.ibm.com, torvalds@...ux-foundation.org,
	'������' <mw78.jang@...sung.com>,
	'������' 
	<jongwook.moon@...sung.com>,
	'������' <wonyong.cho@...sung.com>,
	'����ȣ' <jh.81.lee@...sung.com>,
	'������' <heemoon.bae@...sung.com>,
	'��ȿâ' <hyochang.nam@...sung.com>,
	'������' 
	<hyunjun79.lee@...sung.com>,
	'������' 
	<dragonwine.kim@...sung.com>,
	'������' <yg76.jung@...sung.com>,
	'������' <tj.cho@...sung.com>,
	linux-kernel@...r.kernel.org
Subject: Re: I found a synchronization problem in mm/vmalloc.c
On Tue, Jan 12, 2010 at 06:32:09PM +0900, Yongseok Koh wrote:
> Sorry, Mr. Morton.
> 
> Even though it is somewhat late, I am doing cc the mailing list.
> 
> Thanks.
> 
> -----Original Message-----
> 
> On Thu, 7 Jan 2010 20:22:30 +0900
> "Yongseok Koh" <yongseok.koh@...sung.com> wrote:
> 
> > Dear all,
> > 
> > I___m Yongseok Koh in Korea.
> > 
> 
> Thanks for the report.
> 
> Please do cc a mailing list when reporting bugs so that everyone else knows
> what's going on.
> 
> > 
> > I just got a new message in linux-2.6.28.10 (plz refer to the below)
> > 
> > And, one of my colleagues found that there is a synchronization 
> > problem in mm/vmalloc.c
> > 
> >  
> > 
> > In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE 
> > first, and then vmap_lazy_nr is increased atomically.
> > 
> > But, in __purge_vmap_area_lazy(), while traversing of vmap_are_list, 
> > nr is counted by checking VM_LAZY_FREE is set to va->flags.
> > 
> > After counting the variable nr, kernel reads vmap_lazy_nr atomically 
> > and checks a BUG_ON condition whether nr is greater than vmap_lazy_nr.
> > 
> >  
> > 
> > The problem is that, if interrupted right after marking VM_LAZY_FREE, 
> > increment of vmap_lazy_nr can be delayed.
> > 
> > Consequently, BUG_ON condition can be met because nr is counted more 
> > than vmap_lazy_nr.
> > 
> >  
> > 
> > What I mentioned is highly probable when vmalloc/vfree are called 
> > frequently.
> > 
> > And my colleagues have verified this scenario by adding delay between 
> > marking VM_LAZY_FREE and increasing vmap_lazy_nr in 
> > free_unmap_area_noflush().
> > 
> >  
> > 
> > Am I right ?
> > 
> 
> Looks plausible to me and as far as I can tell, current code has the same
> issue.
Yes, I think it's a good catch.
 
> Wakey wakey, Nick!  What makes that BUG_ON() safe?  Not purge_lock afacit?
No I think it is a bug. I would say that we can just get rid of the BUG_ON
now. atomic_t is signed, so it should be OK if it momentarily goes negative
(and anyway it's only used in a heuristic).
So, thanks for the report. Would you care to send a patch, or propose
another way to fix the problem?
Thanks,
Nick
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Powered by blists - more mailing lists
 
