lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20071017165949.GF15552@kernel.dk>
Date:	Wed, 17 Oct 2007 18:59:53 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [bug] block subsystem related crash with latest -git

On Wed, Oct 17 2007, Linus Torvalds wrote:
> 
> 
> On Wed, 17 Oct 2007, Ingo Molnar wrote:
> > 
> > Jens, just got this crash on a testbox:
> 
> The code in question is:
> 
>      mov    %edx,0xc(%esp)
>      mov    (%ebx),%edi
>      mov    %edi,%edx
>      sub    %eax,%edx
>      mov    %edx,%eax
>      sar    $0x5,%eax
>      shl    $0xc,%eax
>      add    0x8(%ebx),%eax
>      cmp    %eax,0xc(%esp)
>      je     +126
>      mov    0x10(%esi),%eax	<----- Oops
>      lea    0x10(%esi),%edx
>      test   $0x1,%al
>      jne    +76
>      mov    %edi,(%esi)
>      mov    %ebp,0xc(%esi)
>      mov    0x8(%ebx),%eax
>      mov    %eax,0x4(%esi)
> 
> 
> and it looks like %esi is overflowing from one page to the next one, ie:
> 
> 	BUG: unable to handle kernel paging request at virtual address 7ca76000
> 	ESI: 7ca75ff0
> 
> and you caught this thanks to page-alloc debugging again.
> 
> I think I can match that up with the source code: that's "sg_next()". It's 
> doing:
> 
>         sg++;
> 
>         if (unlikely(sg_is_chain(sg)))
>                 sg = sg_chain_ptr(sg);
> 
>         return sg;
> 
> and the oopsing instruction is that load of "sg->page" in the assembly 
> code:
> 
> 	mov    0x10(%esi),%eax		# %eax = sg->page
> 	lea    0x10(%esi),%edx		# %edx = sg+1;
> 	test   $0x1,%al			# if (unlikely(sg_is_chain()))
> 	jne    +76
> 
> Jens?

Yep, that's what I came up with as well - I asked Ingo for a dump in
private, but ended up just using ksymoops to decode the line.

The way blk_rq_map_sg() operates is that it ends up doing a

        next_sg = sg_next(sg);

even though sg may be the last entry. Perhaps this is crapping out,
although if sg is a valid address, then sg + 1 should be as well.
next_sg may end up being crap, in fact it will, but we'll never use that
unless there are more entries to fill. And if there is, then both sg and
next_sg were valid.

So nothing in for-linus should fix it, I'll try and come up with an
alternate way to assign next_sg so it's always valid.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ