lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <m1ve95ctuc.fsf@ebiederm.dsl.xmission.com>
Date:	Wed, 17 Oct 2007 15:30:19 -0600
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Chris Mason <chris.mason@...cle.com>
Cc:	Christian Borntraeger <borntraeger@...ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nick Piggin <nickpiggin@...oo.com.au>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	"Theodore Ts'o" <tytso@....edu>, stable@...nel.org
Subject: Re: [PATCH] rd: Mark ramdisk buffers heads dirty

Chris Mason <chris.mason@...cle.com> writes:

>> Thinking about it.  I don't believe anyone has ever intentionally built
>> a filesystem tool that depends on being able to modify a file systems
>> metadata buffer heads while the filesystem is running, and doing that
>> would seem to be fragile as it would require a lot of cooperation
>> between the tool and the filesystem about how the filesystem uses and
>> implement things.
>> 
>
> That's right.  For example, ext2 is doing directories in the page cache
> of the directory inode, so there's a cache alias between the block
> device page cache and the directory inode page cache.
>
>> Now I guess I need to see how difficult a patch would be to give
>> filesystems magic inodes to keep their metadata buffer heads in.
>
> Not hard, the block device inode is already a magic inode for metadata
> buffer heads.  You could just make another one attached to the bdev.
>
> But, I don't think I fully understand the problem you're trying to
> solve?


So the start:
When we write buffers from the buffer cache we clear buffer_dirty
but not PageDirty

So try_to_free_buffers() will mark any page with clean buffer_heads
that is not clean itself clean.

The ramdisk set pages dirty to keep them from being removed from the
page cache, just like ramfs.

Unfortunately when those dirty ramdisk pages get buffers on them and
those buffers all go clean and we are trying to reclaim buffer_heads
we drop those pages from the page cache.   Ouch!

We can fix the ramdisk by setting making certain that buffer_heads
on ramdisk pages stay dirty as well.  The problem is this breaks
filesystems like reiserfs and ext3 that expect to be able to make 
buffer_heads clean sometimes.

There are other ways to solve this for ramdisks, such as changing
where ramdisks are stored.  However fixing the ramdisks this way
still leaves the general problem that there are other paths to the
filesystem metadata buffers, and those other paths cause the code
to be complicated and buggy.

So I'm trying to see if we can untangle this Gordian knot, so the
code because more easily maintainable.  

To make the buffer cache a helper library instead of require
infrastructure, it looks like two things need to happen.
- Move metadata buffer heads off block devices page cache entries, 
- Communicate the mappings of data pages to block device sectors
  in a generic way without buffer heads.

How we ultimately fix the ramdisk tends to depend on how we untangle
the buffer head problem.  Right now the only simple solution is to
suppress try_to_free_buffers, which is a bit ugly.  We can also come
up with a completely separate store for the pages in the buffer cache
but if we wind up moving the metadata buffer heads anyway then that
should not be necessary.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ