Message-ID: <20130219091931.GB21945@quack.suse.cz>
Date:	Tue, 19 Feb 2013 10:19:31 +0100
From:	Jan Kara <jack@...e.cz>
To:	Li Zefan <lizefan@...wei.com>
Cc:	linux-fsdevel@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	Ext4 Developers List <linux-ext4@...r.kernel.org>,
	Jan Kara <jack@...e.cz>, Theodore Ts'o <tytso@....edu>,
	Andrew Morton <akpm@...ux-foundation.org>, andi@...stfloor.org,
	Wuqixuan <wuqixuan@...wei.com>,
	Al Viro <viro@...IV.linux.org.uk>, gregkh@...uxfoundation.org
Subject: Re: [RFC][PATCH] vfs: always protect diretory file->fpos with
 inode mutex

On Tue 19-02-13 09:22:40, Li Zefan wrote:
> There's a long-standing bug, old enough that I don't know when it dates
> from.
> 
> I've written and attached a simple program to reproduce this bug, and it can
> immediately trigger the bug in my box. It uses two threads, one keeps calling
> read(), and the other calling readdir(), both on the same directory fd.
  So the fact that read() or even write() on an fd opened O_RDONLY has *any*
effect on f_pos looks really unexpected to me. I think we really should
have this there:
	if (ret >= 0)
		file_pos_write(...);
  That would solve the problems with read() and write() on directories for
pretty much every filesystem, since the first usually returns -EISDIR and
the second -EBADF.
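
  To illustrate the idea, here is a minimal user-space sketch, not kernel
code. The struct file, the helpers and vfs_read_dir_stub() are hypothetical
stand-ins for the kernel ones, and the stub advances f_pos itself to replay,
deterministically, what a concurrent readdir() would do between
file_pos_read() and file_pos_write(). The increment of 12 is an arbitrary
assumption.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical user-space stand-in for the kernel's struct file. */
struct file {
	off_t f_pos;
};

off_t file_pos_read(struct file *file)
{
	return file->f_pos;
}

void file_pos_write(struct file *file, off_t pos)
{
	file->f_pos = pos;
}

/*
 * Stand-in for vfs_read() on a directory: it fails with -EISDIR. To model
 * the race without threads, it also advances file->f_pos the way a
 * concurrent readdir() running under i_mutex would.
 */
ssize_t vfs_read_dir_stub(struct file *file, off_t *pos)
{
	(void)pos;
	file->f_pos += 12;	/* concurrent readdir() moved the position */
	return -EISDIR;
}

/* Current behaviour: the stale position is written back unconditionally. */
ssize_t read_unconditional(struct file *file)
{
	off_t pos = file_pos_read(file);
	ssize_t ret = vfs_read_dir_stub(file, &pos);

	file_pos_write(file, pos);
	return ret;
}

/* Suggested behaviour: write the position back only on success. */
ssize_t read_checked(struct file *file)
{
	off_t pos = file_pos_read(file);
	ssize_t ret = vfs_read_dir_stub(file, &pos);

	if (ret >= 0)
		file_pos_write(file, pos);
	return ret;
}

/* Replays the interleaving once for each variant. */
void check_interleaving(void)
{
	struct file f = { .f_pos = 100 };

	assert(read_unconditional(&f) == -EISDIR);
	assert(f.f_pos == 100);		/* readdir()'s update was lost */

	f.f_pos = 100;
	assert(read_checked(&f) == -EISDIR);
	assert(f.f_pos == 112);		/* readdir()'s update survived */
}
```

  Calling check_interleaving() from a main() exercises both variants. The
sketch only covers the failing-read case discussed here; a read() that
succeeds would still write f_pos back without i_mutex.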

> When I ran it on ext3 (can be replaced with ext2/ext4) with the dir_index
> feature disabled, I got this:
> 
> EXT3-fs error (device loop1): ext3_readdir: bad entry in directory #34817: rec_len is smaller than minimal - offset=993, inode=0, rec_len=0, name_len=0
> EXT3-fs error (device loop1): ext3_readdir: bad entry in directory #34817: rec_len is smaller than minimal - offset=1009, inode=0, rec_len=0, name_len=0
> EXT3-fs error (device loop1): ext3_readdir: bad entry in directory #34817: rec_len is smaller than minimal - offset=993, inode=0, rec_len=0, name_len=0
> EXT3-fs error (device loop1): ext3_readdir: bad entry in directory #34817: rec_len is smaller than minimal - offset=1009, inode=0, rec_len=0, name_len=0
> ...
> 
> If errors=remount-ro is configured, the filesystem becomes read-only.
> 
> SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
> {
> 	...
> 		loff_t pos = file_pos_read(file);
> 		ret = vfs_read(file, buf, count, &pos);
> 		file_pos_write(file, pos);
> 		fput_light(file, fput_needed);
> 	...
> }
> 
> While readdir() is protected with i_mutex, f_pos can be changed without
> any locking in various read()/write() syscalls, which leads to this bug.
> 
> What makes things worse is Andi removed i_mutex from generic_file_llseek,
> so you can trigger the same bug by replacing read() with lseek() in the
> test program.
  Yes, and here I'd say it's a filesystem issue. If a filesystem needs f_pos
to change only under i_mutex, it should use default_llseek() or take the
mutex itself. That's what the callback is for. We shouldn't unnecessarily
impose the i_mutex restriction on llseek on a directory for every filesystem.
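
  As a rough user-space model of "take the mutex itself" (not kernel code):
struct model_inode, struct model_dir_file and dir_llseek_locked() are
hypothetical names, and a pthread mutex stands in for i_mutex. Only SEEK_SET
and SEEK_CUR are handled, which is an assumption of the sketch rather than a
statement about what any real filesystem supports.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>	/* SEEK_SET, SEEK_CUR */

/* Hypothetical stand-ins; the pthread mutex plays the role of i_mutex. */
struct model_inode {
	pthread_mutex_t i_mutex;
};

struct model_dir_file {
	struct model_inode *f_inode;
	off_t f_pos;
};

/*
 * Directory llseek callback that serializes every f_pos update against
 * readdir() by taking the inode mutex itself.
 */
off_t dir_llseek_locked(struct model_dir_file *file, off_t offset, int whence)
{
	off_t ret;

	pthread_mutex_lock(&file->f_inode->i_mutex);
	switch (whence) {
	case SEEK_SET:
		ret = offset;
		break;
	case SEEK_CUR:
		ret = file->f_pos + offset;
		break;
	default:
		ret = -1;	/* unsupported whence in this sketch */
		break;
	}
	if (ret < 0)
		ret = -EINVAL;
	else
		file->f_pos = ret;	/* updated only while holding the mutex */
	pthread_mutex_unlock(&file->f_inode->i_mutex);

	return ret;
}

/* Small self-check of the sketch. */
void check_dir_llseek(void)
{
	struct model_inode inode;
	struct model_dir_file file = { &inode, 0 };

	pthread_mutex_init(&inode.i_mutex, NULL);
	assert(dir_llseek_locked(&file, 40, SEEK_SET) == 40);
	assert(dir_llseek_locked(&file, 8, SEEK_CUR) == 48);
	assert(dir_llseek_locked(&file, -100, SEEK_CUR) == -EINVAL);
	assert(file.f_pos == 48);	/* failed seek left f_pos untouched */
	pthread_mutex_destroy(&inode.i_mutex);
}
```

  A filesystem that does not care about this ordering would simply not take
the lock in its callback, which is the point of leaving the choice to the
filesystem.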

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR