lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <64bb37e0711232057v611a7114i9d8ba3ef6a0d6fc2@mail.gmail.com>
Date:	Sat, 24 Nov 2007 05:57:21 +0100
From:	"Torsten Kaiser" <just.for.lkml@...glemail.com>
To:	"Alasdair G Kergon" <agk@...hat.com>,
	"Torsten Kaiser" <just.for.lkml@...glemail.com>,
	"Milan Broz" <mbroz@...hat.com>, "Ingo Molnar" <mingo@...e.hu>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Linux Kernel list" <linux-kernel@...r.kernel.org>
Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Nov 24, 2007 4:49 AM, Alasdair G Kergon <agk@...hat.com> wrote:
> On Fri, Nov 23, 2007 at 11:42:36PM +0100, Torsten Kaiser wrote:
> > ... or I just don't see the bug.
>
> See my earlier post in this thread: there's a race in the write loop
> where a work struct could be used twice on the same queue.
> (Needs data structure change to fix that, which nobody has attempted
> to do yet.)

As I wrote in an earlier post:
I did see this lockdep message even with
agk-dm-dm-crypt-move-bio-submission-to-thread.patch reverted, so the
work struct is not used in the write loop.

> BTW To eliminate any internal lockdep concerns (and people say there
> should be no problem) temporarily add a second struct instead of reusing
> one on two queues.

I think, this might really be a lockdep bug, but as I'm not fluent
enough with C, please check, if my logik is correct:

The freed-locked-lock-test is the only function that uses this in lockdep.c:
static inline int in_range(const void *start, const void *addr, const void *end)
{
        return addr >= start && addr <= end;
}
This  will return true, if addr is in the range of start (including)
to end (including).

But debug_check_no_locks_freed() seems does:
const void *mem_to = mem_from + mem_len
-> mem_to is the last byte of the freed range, that fits in_range
lock_from = (void *)hlock->instance;
-> first byte of the lock
lock_to = (void *)(hlock->instance + 1);
-> first byte of the next lock, not last byte of the lock that is being checked!
(Or am I reading this wrong?)

The test is:
if (!in_range(mem_from, lock_from, mem_to) &&
                                        !in_range(mem_from, lock_to, mem_to))
                        continue;
So it tests, if the first byte of the lock is in the range that is freed ->OK
And if the first byte of the *next* lock is in the range that is freed
-> Not OK.

That would also explain the rather strange output:
=========================
[ BUG: held lock freed! ]
-------------------------
kcryptd/1022 is freeing memory
FFFF81011EBEFB00-FFFF81011EBEFB3F, with a lock still held there!
  (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
2 locks held by kcryptd/1022:
 #0:  (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
 #1:  (&io->work#2){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210

That claims that the lock of the *workqueue* struct, not the work
struct is getting freed!
But I'm still happily using the dm-crypt device, even 19 hours after
that message.

So my current best guess to the source of this message is, that with
the change in the ref counting it is now possible that the work struct
is really getting freed before the workqueue function returns. But as
the comment in run_workqueue() says, that is still legal.
But now the first byte of the next lock is part of the freed memory
and so the wrong "held lock freed" is triggered.

Torsten
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ