linux-kernel - Re: [xfs-masters] Re: 2.6.22-rc1-mm1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070517084135.GA8510@lst.de>
Date:	Thu, 17 May 2007 10:41:35 +0200
From:	Christoph Hellwig <hch@....de>
To:	xfs-masters@....sgi.com
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Michal Piotrowski <michal.k.k.piotrowski@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [xfs-masters] Re: 2.6.22-rc1-mm1

On Thu, May 17, 2007 at 12:06:00PM +1000, David Chinner wrote:
> > static inline int put_page_testzero(struct page *page)
> > {
> > 	VM_BUG_ON(atomic_read(&page->_count) == 0);
> > 	return atomic_dec_and_test(&page->_count);
> > }
> 
> I haven't seen that one. I expect that it will be the noaddr buffer allocation
> changes that have triggered this...

Yes.   xfs_buf_get_noaddr calls xfs_buf_free to free a buffer when
something fails.  But this is wrong - we want to call xfs_buf_deallocate
before we setup the page list, and if a page allocation fails we want to
do out own freeing of just the pages we allocated and call
_xfs_buf_free_pages.  Currently we do our own freeing _and_ call
xfs_buf_free which leads to this double free.


Signed-off-by: Christoph Hellwig <hch@....de>


Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c	2007-05-17 09:34:44.000000000 +0200
+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c	2007-05-17 09:36:53.000000000 +0200
@@ -792,8 +792,9 @@ xfs_buf_get_noaddr(
  fail_free_mem:
  	while (--i >= 0)
 		__free_page(bp->b_pages[i]);
+	_xfs_buf_free_pages(bp);
  fail_free_buf:
-	xfs_buf_free(bp);
+	xfs_buf_deallocate(bp);
  fail:
 	return NULL;
 }

> 
> > > [ 6666.719690] invalid opcode: 0000 [#1]
> > > [ 6666.723397] PREEMPT SMP
> > > [ 6666.725999] Modules linked in: xfs loop pktgen ipt_MASQUERADE iptable_nat nf_nat autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd soundcore intel_agp agpgart snd_page_alloc i2c_i801 ide_cd cdrom rtc unix
> > > [ 6666.776026] CPU:    0
> > > [ 6666.776027] EIP:    0060:[<c01693ec>]    Not tainted VLI
> > > [ 6666.776028] EFLAGS: 00010202   (2.6.22-rc1-mm1 #3)
> > > [ 6666.788519] EIP is at put_page+0x44/0xee
> > > [ 6666.792491] eax: 00000001   ebx: c549f728   ecx: c04b27e0   edx: 00000001
> > > [ 6666.799345] esi: 00000000   edi: 00000080   ebp: d067e9e0   esp: d067e9c8
> > > [ 6666.806208] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
> > > [ 6666.812104] Process mount (pid: 9419, ti=d067e000 task=d00a4070 task.ti=d067e000)
> > > [ 6666.819486] Stack: d8980180 00000080 d067e9f0 d8980180 00000000 00000080 d067e9f0 fdc8eda3
> > > [ 6666.828103]        fffffffc d8980180 d067ea20 fdc8f7ff fdc9b425 fdc96e5c 00080000 00000000
> > > [ 6666.836635]        c549dfd0 00000200 ffffffff cd44b8e0 00002160 cd44b8e0 d067ea30 fdc78937
> > > [ 6666.845253] Call Trace:
> > > [ 6666.847939]  [<fdc8eda3>] xfs_buf_free+0x41/0x61 [xfs]
> > > [ 6666.853247]  [<fdc8f7ff>] xfs_buf_get_noaddr+0x10c/0x118 [xfs]
> > > [ 6666.859231]  [<fdc78937>] xlog_get_bp+0x65/0x69 [xfs]
> 
> Yeah - that trace implies a memory allocation failure when allocating
> log buffer pages and the cleanup looks like it does a double free
> of the pages that got allocated. Patch attached below that should fix
> this problem.
> 
> > > [ 6667.271984] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > >
> > > ...
> > >
> > > [ 6670.074487] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > > [ 6670.240395] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > > [ 6670.350305] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > > [ 6670.458773] XFS: Filesystem loop1 has duplicate UUID - can't mount
> 
> I assume that the thread doing the mount got killed by the BUG and so the
> normal error handling path on log mount failure was not executed and hence the
> uuid for the filesystem never got removed from the table used to detect
> multiple mounts of the same filesystem....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
> 
> ---
>  fs/xfs/linux-2.6/xfs_buf.c |   21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_buf.c	2007-05-11 16:03:26.000000000 +1000
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c	2007-05-17 11:53:40.293585132 +1000
> @@ -323,9 +323,16 @@ xfs_buf_free(
>  		for (i = 0; i < bp->b_page_count; i++) {
>  			struct page	*page = bp->b_pages[i];
>  
> -			if (bp->b_flags & _XBF_PAGE_CACHE)
> +			/* handle noaddr allocation failure case */
> +			if (!page)
> +				break;
> +
> +			if (bp->b_flags & _XBF_PAGE_CACHE) {
>  				ASSERT(!PagePrivate(page));
> -			page_cache_release(page);
> +				page_cache_release(page);
> +			} else {
> +				__free_page(page);
> +			}
>  		}
>  		_xfs_buf_free_pages(bp);
>  	}
> @@ -766,6 +773,8 @@ xfs_buf_get_noaddr(
>  		goto fail;
>  	_xfs_buf_initialize(bp, target, 0, len, 0);
>  
> +	bp->b_flags |= _XBF_PAGES;
> +
>  	error = _xfs_buf_get_pages(bp, page_count, 0);
>  	if (error)
>  		goto fail_free_buf;
> @@ -773,15 +782,14 @@ xfs_buf_get_noaddr(
>  	for (i = 0; i < page_count; i++) {
>  		bp->b_pages[i] = alloc_page(GFP_KERNEL);
>  		if (!bp->b_pages[i])
> -			goto fail_free_mem;
> +			goto fail_free_buf;
>  	}
> -	bp->b_flags |= _XBF_PAGES;
>  
>  	error = _xfs_buf_map_pages(bp, XBF_MAPPED);
>  	if (unlikely(error)) {
>  		printk(KERN_WARNING "%s: failed to map pages\n",
>  				__FUNCTION__);
> -		goto fail_free_mem;
> +		goto fail_free_buf;
>  	}
>  
>  	xfs_buf_unlock(bp);
> @@ -789,9 +797,6 @@ xfs_buf_get_noaddr(
>  	XB_TRACE(bp, "no_daddr", len);
>  	return bp;
>  
> - fail_free_mem:
> -	while (--i >= 0)
> -		__free_page(bp->b_pages[i]);
>   fail_free_buf:
>  	xfs_buf_free(bp);
>   fail:
> 
---end quoted text---
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/