Message-Id: <1246441592.8492.38.camel@pc1117.cambridge.arm.com>
Date: Wed, 01 Jul 2009 10:46:31 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
git-commits-head@...r.kernel.org
Subject: Re: [PATCH] kmemleak: Fix scheduling-while-atomic bug

On Wed, 2009-07-01 at 11:30 +0200, Ingo Molnar wrote:
> * Catalin Marinas <catalin.marinas@....com> wrote:
>
> > > The minimal fix below removes scan_yield() and adds a
> > > cond_resched() to the outmost (safe) place of the scanning
> > > thread. This solves the regression.
> >
> > With CONFIG_PREEMPT disabled it won't reschedule during the bss
> > scanning but I don't see this as a real issue (task stacks
> > scanning probably takes longer anyway).
>
> Yeah. I suspect one more cond_resched() could be added - I just
> didn't see an obvious place for it, given that scan_block() is being
> called with asymmetric held-locks contexts.

Yes, scan_block() shouldn't call cond_resched(). The code is cleaner if
functions don't have too many side effects. I see about 1 second of bss
scanning on an ARM board, but that is with the processor at < 500MHz and
a slow memory system. On a standard x86 system, bss scanning may not be
noticeable (and I think enabling PREEMPT is quite common these days).
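
To illustrate the point about keeping the yield out of scan_block(), here
is a minimal user-space sketch, not the kmemleak code. All names
(struct region, scan_block_model(), scan_all_model()) are hypothetical,
and sched_yield() is only a stand-in for cond_resched(). The helper that
may be called with locks held has no scheduling side effect; the yield
sits in the outermost loop, where this sketch assumes no locks are held.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory area to scan. */
struct region {
	const uintptr_t *start;
	size_t nwords;
};

/*
 * Count the words in the region whose value falls in [lo, hi).
 * No yielding here: in this model the callers may hold locks.
 */
static size_t scan_block_model(const struct region *r,
			       uintptr_t lo, uintptr_t hi)
{
	size_t hits = 0;

	assert(r != NULL);
	for (size_t i = 0; i < r->nwords; i++)
		if (r->start[i] >= lo && r->start[i] < hi)
			hits++;
	return hits;
}

/*
 * Outermost loop of the (modelled) scanning thread. Assumed to run
 * with no locks held, so it is the safe place to give up the CPU.
 */
static size_t scan_all_model(const struct region *regions, size_t n,
			     uintptr_t lo, uintptr_t hi)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++) {
		hits += scan_block_model(&regions[i], lo, hi);
		sched_yield();	/* user-space stand-in for cond_resched() */
	}
	return hits;
}
```

The trade-off is the one described above: a single large region is
scanned without a yield point, which only matters when that scan is slow.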

While we are on the subject of locking, I just noticed this on my x86
laptop when running cat /sys/kernel/debug/kmemleak (I haven't seen it on
an ARM board):

================================================
[ BUG: lock held when returning to user space! ]
------------------------------------------------
cat/3687 is leaving the kernel with locks still held!
1 lock held by cat/3687:
 #0:  (scan_mutex){+.+.+.}, at: [<c01e0c5c>] kmemleak_open+0x3c/0x70

kmemleak_open() acquires scan_mutex, and kmemleak_release()
unconditionally releases it. The mutex does seem to be released, since a
subsequent acquisition works fine.

Is this just because cat may have exited without closing the file
descriptor (which should be done automatically anyway)?
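
To make the pattern in the report concrete, here is a small user-space
model, not kernel code. The names open_model(), release_model() and
returns_to_user_clean() are hypothetical, and the held-lock counter is
only a crude stand-in for the per-task lock tracking that produces the
message above. It assumes the check simply asks whether any lock is
still held when a call returns, which is why taking the mutex in one
call and dropping it in a later one is flagged even though the mutex is
released eventually.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t scan_mutex_model = PTHREAD_MUTEX_INITIALIZER;
static int locks_held;	/* crude model of a per-task held-lock count */

/* Models an open() handler that returns with the mutex still held. */
static void open_model(void)
{
	pthread_mutex_lock(&scan_mutex_model);
	locks_held++;
}

/* Models a release() handler that drops the mutex taken in open(). */
static void release_model(void)
{
	assert(locks_held > 0);
	locks_held--;
	pthread_mutex_unlock(&scan_mutex_model);
}

/* Models the check made when a call returns to user space. */
static bool returns_to_user_clean(void)
{
	return locks_held == 0;
}
```

In this model the check fails right after open_model() returns and
passes again after release_model(), and a second open_model() succeeds,
which matches the behaviour I see.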

Thanks.

--
Catalin