Message-ID: <20170818134129.ubollrjtjenlfrqd@thunk.org>
Date: Fri, 18 Aug 2017 09:41:29 -0400
From: Theodore Ts'o <tytso@....edu>
To: Deepa Dinamani <deepa.kernel@...il.com>
Cc: Andreas Dilger <adilger@...ger.ca>, Arnd Bergmann <arnd@...db.de>,
Wang Shilong <wshilong@....com>,
Wang Shilong <wangshilong1991@...il.com>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
Shuichi Ihara <sihara@....com>, Li Xi <lixi@....com>,
Jan Kara <jack@...e.cz>
Subject: Re: Y2038 bug in ext4 recently_deleted() function
On Thu, Aug 17, 2017 at 06:23:26PM -0700, Deepa Dinamani wrote:
>
> I don't think dtime has widened on the disk layout for ext4 according
> to https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout. So I am
> not sure how fixing the internal implementation would be useful until
> we do that. Is there a plan for that?
The dtime field is not visible to users; it's mostly for debugging
purposes. For debugfs we are just using i_ctime_extra to compose
the time. (Perhaps we should be using i_mtime_extra, or the max of
the ctime, mtime, and atime extra fields; but it's not really that
important.)
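As a rough illustration of what "compose the time" means here (a
sketch only, not the actual debugfs code): this assumes the low two
bits of an ext4 *_extra timestamp field are epoch bits that extend the
32-bit seconds value, and that dtime is sign-extended like the other
inode timestamps. The helper name compose_dtime() and the macro names
are made up for the example.

```c
#include <stdint.h>

/* Assumption: the low 2 bits of an ext4 "*_extra" field are epoch bits;
 * the upper 30 bits hold nanoseconds and are ignored here. */
#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)

/* Hypothetical helper: widen a 32-bit on-disk dtime by borrowing the
 * epoch bits from i_ctime_extra. */
static int64_t compose_dtime(uint32_t dtime, uint32_t ctime_extra)
{
    /* Assumption: sign-extend the 32-bit value first. */
    int64_t secs = (int32_t)dtime;

    /* Each epoch step adds 2**32 seconds. */
    secs += (int64_t)(ctime_extra & EPOCH_MASK) << 32;
    return secs;
}
```

Using the max of the ctime, mtime, and atime extra fields would only
change which value is passed as the second argument.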
The issue which Andreas pointed out is the only place where we
actually use the dtime field, and that's so we can avoid re-using a
freshly deleted inode until at least N seconds have gone by in
no-journal mode. That's because if we don't, there are some
unfortunate effects that can take place if we crash and not all of the
metadata gets updated. Even after running e2fsck -fy, we can end up
having a directory or an immutable file show up where ntp or timed
expects to find a time adjustment file, or some such, that can cause
various system daemons to crash and burn because they aren't expecting
to find a file at a particular pathname they own which they can't delete.
There are a number of ways we could solve it; one is to just use a new
in-memory variable which can be 64 bits wide. This burns an extra 8
bytes for each inode in the inode cache, which is why we didn't do
that.
It doesn't really have to be super exact; if we actually have an inode
that avoids getting reused for 136 years (2**32 seconds), it will have
disappeared from the in-memory inode cache. We just need something
which is valid for N seconds after the deletion time. (I think we may
have upped N to a larger value on our data center kernels --- 300
seconds if I recall correctly --- because there were some edge cases
where 35 seconds wasn't enough.)
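To make the "doesn't have to be exact" point concrete, here is a
minimal sketch of a check that stays within 32 bits. It is not the
kernel's recently_deleted(); the helper name deleted_within() and its
parameters are hypothetical. It relies only on unsigned arithmetic
wrapping modulo 2**32, so the answer is right whenever the real gap
between deletion and "now" is less than 2**32 seconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: was the inode deleted less than `window` seconds
 * ago?  `now` and `dtime` are the low 32 bits of the respective times. */
static bool deleted_within(uint32_t now, uint32_t dtime, uint32_t window)
{
    /* Unsigned subtraction wraps modulo 2**32, so this is the true
     * elapsed time as long as it is under 2**32 seconds, even when
     * `now` has wrapped past zero and `dtime` has not. */
    uint32_t elapsed = now - dtime;

    return elapsed < window;
}
```

For example, with a window of 35, a dtime of 0xFFFFFFF0 and a now of 5
(just after the 32-bit counter wraps) give an elapsed time of 21
seconds, so the inode still counts as recently deleted. A dtime that
appears to be in the future yields a huge elapsed value and is treated
as not recent.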
- Ted