linux-ext4 - Re: rfc: [patch] change attribute for ext3

Message-ID: <20060914134831.GE28663@openx1.frec.bull.fr>
Date:	Thu, 14 Sep 2006 15:48:31 +0200
From:	Alexandre Ratchov <alexandre.ratchov@...l.net>
To:	Andreas Dilger <adilger@...sterfs.com>
Cc:	Trond Myklebust <trond.myklebust@....uio.no>,
	linux-ext4@...r.kernel.org, nfsv4@...ux-nfs.org
Subject: Re: rfc: [patch] change attribute for ext3

On Thu, Sep 14, 2006 at 03:23:18AM -0600, Andreas Dilger wrote:
> On Sep 13, 2006  20:30 +0200, Alexandre Ratchov wrote:
> > On Wed, Sep 13, 2006 at 02:11:11PM -0400, Trond Myklebust wrote:
> > > On Wed, 2006-09-13 at 18:42 +0200, Alexandre Ratchov wrote:
> > > > the change attribute is a simple counter that is reset to zero on
> > > > inode creation and that is incremented every time the inode data is
> > > > modified (similarly to the "ctime" time-stamp).
> > > 
> > > I would really have preferred a full-blown 64-bit counter as per
> > > RFC3530, but I suppose we could always combine this change attribute
> > > with the high word from ctime in order to make up the NFSv4 change
> > > attribute. That should keep us safe until someone develops a ramdisk
> > > with < 1 nsecond access time.
> > 
> > do you mean something like "(ctime.tv_sec << 32) | change_attribute"? this
> > would allow 2^32 inode changes per second.
> 
> It might be preferable, since we are depending on the ctime here anyway,
> to combine this with the nsec-resolution ctime, and kill two birds with
> one field in the inode.
> 
> The implementation would be to update the ctime+nsec field as normal, but
> in the unlikely case that both the second+nsec ctime is the same as before
> the nsec value would be incremented by 1.  This could happen in case of
> low-resolution kernel timers, and would also handle the future case where
> the inode is modified more than once in the same nanosecond.
> 
> The other benefit is that it allows comparisons between two different
> inodes to be more meaningful, instead of just using the seconds + random
> version number.
> 
> It would be possible/desirable to make the nsec ctime field be part of the 
> small inode (using the proposed reserved field) instead of the large inode,
> since that is a requirement for working with existing ext3 filesystems.  The
> previous nsec timestamp patch would only need trivial modifications to make
> this work, just #define i_ctime_extra to be l_i_reserved1 I believe.
> 

There is something I dislike about incrementing the nsec value. The ctime is
a global (as opposed to per-inode) time reference for the file-system, and
it is expected to be globally coherent. Imagine the following situation:

Within the same time-slice (with time-stamp T0, in nanoseconds), we do the
following in this order:

change file1	-> 	ctime = T0
change file2	->	ctime = T0
change file2	->	ctime = T0 + 1
change file2	->	ctime = T0 + 2
change file1	->	ctime = T0 + 1

So it appears that file2 is strictly newer than file1, which is false:
file1 was the last one to be changed. The assumption "if ctime(file1) <
ctime(file2) then file2 is newer than file1" is no longer true.
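
As a rough illustration, the sequence above can be replayed in a small
user-space C model. This is only a sketch, not ext3 code: struct
inode_model, touch_per_inode() and replay_shows_inversion() are
hypothetical names, ctime is flattened into a single nanosecond count, and
the bump rule is one reading of the proposal (add 1 ns whenever the clock
value would not move the stamp forward).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: ctime flattened into one nanosecond count. */
struct inode_model {
    uint64_t ctime_ns;
};

/*
 * Per-inode bump rule (an assumed reading of the proposal): use the
 * clock value, but if it would not move this inode's stamp forward,
 * add 1 ns to the inode's own stamp instead.
 */
static void touch_per_inode(struct inode_model *ino, uint64_t now_ns)
{
    if (now_ns > ino->ctime_ns)
        ino->ctime_ns = now_ns;
    else
        ino->ctime_ns += 1;
}

/*
 * Replay the five changes within one timer tick t0 (t0 must be > 0).
 * Returns 1 if file2 ends up with the larger ctime even though file1
 * was the last inode to be changed.
 */
static int replay_shows_inversion(uint64_t t0)
{
    struct inode_model file1 = { 0 }, file2 = { 0 };

    touch_per_inode(&file1, t0);    /* file1: T0     */
    touch_per_inode(&file2, t0);    /* file2: T0     */
    touch_per_inode(&file2, t0);    /* file2: T0 + 1 */
    touch_per_inode(&file2, t0);    /* file2: T0 + 2 */
    touch_per_inode(&file1, t0);    /* file1: T0 + 1, the last change */

    assert(file1.ctime_ns == t0 + 1);
    assert(file2.ctime_ns == t0 + 2);
    return file2.ctime_ns > file1.ctime_ns;
}
```

Calling replay_shows_inversion(1000) returns 1: the stamps are ordered
file1 < file2, the opposite of the order in which the files were actually
last changed.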

In order to fix this, we'll need to increment a global counter, not a
per-inode counter. It's feasible.
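
A minimal sketch of what such a global counter could look like, again as a
user-space C model rather than kernel code. All names here (struct
fs_clock, struct gc_inode, fs_clock_next(), touch_global(),
replay_is_ordered()) are hypothetical, and locking is left out on purpose;
a real implementation would need a lock or an atomic update around the
shared state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-file-system state: the last ctime handed out. */
struct fs_clock {
    uint64_t last_ns;
};

/* Hypothetical inode model: ctime flattened into one nanosecond count. */
struct gc_inode {
    uint64_t ctime_ns;
};

/*
 * Return a stamp that is strictly greater than every stamp handed out
 * before on this file system. If the clock has not advanced, bump the
 * shared value by 1 ns instead of bumping a per-inode value.
 * Not thread-safe as written (assumption: caller serializes access).
 */
static uint64_t fs_clock_next(struct fs_clock *clk, uint64_t now_ns)
{
    if (now_ns > clk->last_ns)
        clk->last_ns = now_ns;
    else
        clk->last_ns += 1;
    return clk->last_ns;
}

static void touch_global(struct fs_clock *clk, struct gc_inode *ino,
                         uint64_t now_ns)
{
    ino->ctime_ns = fs_clock_next(clk, now_ns);
}

/*
 * Replay the same five changes within one timer tick t0 (t0 must be > 0).
 * Returns 1 if file1, which is changed last, ends up with the larger ctime.
 */
static int replay_is_ordered(uint64_t t0)
{
    struct fs_clock clk = { 0 };
    struct gc_inode file1 = { 0 }, file2 = { 0 };

    touch_global(&clk, &file1, t0);  /* file1: T0     */
    touch_global(&clk, &file2, t0);  /* file2: T0 + 1 */
    touch_global(&clk, &file2, t0);  /* file2: T0 + 2 */
    touch_global(&clk, &file2, t0);  /* file2: T0 + 3 */
    touch_global(&clk, &file1, t0);  /* file1: T0 + 4, the last change */

    assert(file2.ctime_ns == t0 + 3);
    assert(file1.ctime_ns == t0 + 4);
    return file1.ctime_ns > file2.ctime_ns;
}
```

With a shared counter every change gets a distinct, increasing stamp, so
comparing the ctime of two inodes gives the real order of their last
changes. The cost is that the stamps can run ahead of the wall clock by
one nanosecond per change made within the same tick.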

cheers,

-- Alexandre