Message-ID: <20210813123434.GB11955@quack2.suse.cz>
Date:   Fri, 13 Aug 2021 14:34:34 +0200
From:   Jan Kara <jack@...e.cz>
To:     Theodore Ts'o <tytso@....edu>
Cc:     Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 3/5] ext4: Speedup ext4 orphan inode handling

On Thu 12-08-21 11:01:34, Theodore Ts'o wrote:
> On Wed, Aug 11, 2021 at 12:19:13PM +0200, Jan Kara wrote:
> > +static int ext4_orphan_file_del(handle_t *handle, struct inode *inode)
> > +{
> > +	struct ext4_orphan_info *oi = &EXT4_SB(inode->i_sb)->s_orphan_info;
> > +	__le32 *bdata;
> > +	int blk, off;
> > +	int inodes_per_ob = ext4_inodes_per_orphan_block(inode->i_sb);
> > +	int ret = 0;
> > +
> > +	if (!handle)
> > +		goto out;
> > +	blk = EXT4_I(inode)->i_orphan_idx / inodes_per_ob;
> > +	off = EXT4_I(inode)->i_orphan_idx % inodes_per_ob;
> > +	if (WARN_ON_ONCE(blk >= oi->of_blocks))
> > +		goto out;
> > +
> > +	ret = ext4_journal_get_write_access(handle, inode->i_sb,
> > +				oi->of_binfo[blk].ob_bh, EXT4_JTR_ORPHAN_FILE);
> > +	if (ret)
> > +		goto out;
> 
> If ext4_journal_get_write_access() fails, we effectively drop the
> inode from the orphan list (as far as the in-memory inode is
> concerned), although the inode will still be listed in the orphan
> file.  This can be really unfortunate since if the inode gets
> reallocated for some other purpose, since its inode number is left in
> the orphan block, on the next remount, this could lead to data loss.
> 
> In the orphan list code, we leave the inode on the linked list, which
> is not great, since that will prevent the inode from being freed, but
> at least we're keeping the in-memory and on-disk state in sync and we
> avoid the data loss scenario when the inode gets reused.

Actually, in the orphan list code, we leave the inode in the on-disk list
but remove it from the in-memory list - see how
list_del_init(&ei->i_orphan) is called very early in ext4_orphan_del(). The
reason for this unconditional deletion is that if we do not remove the
inode from the in-memory orphan list, the filesystem will complain and
corrupt memory on unmount.
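
To make the ordering concrete, here is a toy model of it (not the ext4 code:
the struct, its two flags, and toy_orphan_del() are hypothetical names made
up for illustration, and journal_err merely stands in for a failure to get
journal write access):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not ext4 code: two flags stand in for the two lists. */
struct toy_orphan_state {
	int on_memory_list;	/* models membership via ei->i_orphan */
	int on_disk_list;	/* models the on-disk orphan linked list */
};

/*
 * journal_err models a failure to get journal write access
 * (0 means success). Returns journal_err unchanged.
 */
static int toy_orphan_del(struct toy_orphan_state *st, int journal_err)
{
	assert(st != NULL);

	/* Unconditional, like the early list_del_init(&ei->i_orphan). */
	st->on_memory_list = 0;

	if (journal_err)
		return journal_err;	/* on-disk entry is left behind */

	st->on_disk_list = 0;
	return 0;
}
```

With a nonzero journal_err the model ends up off the in-memory list but
still on the on-disk one, which is the state described above.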

Also note that leaving the inode in the on-disk orphan list actually does no
serious harm, because the orphan cleanup code just checks i_nlink and
i_disksize: it truncates the inode down to the current i_disksize, and
removes the inode completely if i_nlink is 0. So even if an inode on the
orphan list gets reused, orphan cleanup will just do nothing for it. The
worst problem that will likely happen is that the on-disk orphan linked list
becomes corrupted, but there's no data loss AFAICT.
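
A rough sketch of that decision, as a simplified model rather than the real
cleanup code (struct toy_inode, its "allocated" field, and
toy_orphan_cleanup() are all hypothetical; "allocated" is just a stand-in
for how much data the inode currently holds):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an inode; only the fields the decision looks at. */
struct toy_inode {
	uint32_t nlink;		/* models i_nlink */
	uint64_t allocated;	/* hypothetical: bytes of data currently held */
	uint64_t disksize;	/* models i_disksize */
};

enum toy_cleanup { TOY_NOTHING, TOY_TRUNCATED, TOY_REMOVED };

/* What cleanup would do for one entry found in the orphan list. */
static enum toy_cleanup toy_orphan_cleanup(struct toy_inode *ino)
{
	assert(ino != NULL);

	if (ino->nlink == 0) {
		/* Unlinked inode: remove it completely. */
		ino->allocated = 0;
		ino->disksize = 0;
		return TOY_REMOVED;
	}
	if (ino->allocated > ino->disksize) {
		/* Interrupted truncate: cut back to the recorded size. */
		ino->allocated = ino->disksize;
		return TOY_TRUNCATED;
	}
	/* Linked inode with nothing beyond its size: leave it alone. */
	return TOY_NOTHING;
}
```

In this model a reused inode (nlink > 0, nothing held beyond disksize) takes
the last branch and is left untouched.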

Is it clearer now or am I missing something?

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
