Message-ID: <a145fc68-9b0a-9794-48d2-b7ad79116833@huawei.com>
Date: Tue, 2 Jan 2024 18:08:10 +0800
From: Zhihao Cheng <chengzhihao1@...wei.com>
To: Richard Weinberger <richard@....at>
CC: david oberhollenzer <david.oberhollenzer@...ma-star.at>, Miquel Raynal
<miquel.raynal@...tlin.com>, Sascha Hauer <s.hauer@...gutronix.de>, Tudor
Ambarus <Tudor.Ambarus@...aro.org>, linux-kernel
<linux-kernel@...r.kernel.org>, linux-mtd <linux-mtd@...ts.infradead.org>
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support
On 2023/12/30 5:08, Richard Weinberger wrote:
>> Second, you are concerned that odd or incomplete files get recovered, as I
>> mentioned in the documentation (Limitations section), which can still make
>> applications fail because a recovered file has lost data or a deleted file
>> has been recovered. So you suggested that I write a userspace fsck tool, so
>> that fsck can tell the user which files have lost data and which files were
>> recovered after deletion.
>>
>> The difficulty comes from the second point: how does fsck know that a file
>> was recovered incomplete or stale? Whether a node exists is decided by the
>> TNC, but the TNC could be damaged, as I described above. Do you have
>> any ideas?
> That's the problem that all fsck tools have in common.
> The best we can do is offer safe and dangerous repair modes
> plus a good repair report.
>
I have come up with another way to implement fsck.ubifs.
There are three modes:
1. common mode (no options): Ask the user whether to fix each problem as it
is detected.
2. safe mode (-a option): Repair automatically as long as no data or files
are lost (e.g. nlink, isize, xattr_cnt, which can be corrected without
dropping nodes); otherwise return an error code.
3. dangerous mode (-y option): The fix always succeeds unless the
superblock is corrupted. There are 2 situations:
a) TNC is valid: fsck prints which files are dropped and which
files' data is dropped
b) TNC is invalid: fsck scans all nodes without referencing the TNC,
the same as this patchset does
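To make the three modes concrete, here is a minimal sketch of the decision
each mode would take for one detected problem. All names (fsck_mode,
fsck_action, struct problem, decide) are hypothetical and exist neither in
the patchset nor in the kernel. The sketch also makes two assumptions that
are not stated above: a corrupted superblock is fatal in every mode, and
safe mode treats an invalid TNC as something it cannot fix without loss.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum fsck_mode { MODE_COMMON, MODE_SAFE, MODE_DANGEROUS };

enum fsck_action {
	ACT_ASK,        /* mode 1: prompt the user before fixing */
	ACT_FIX,        /* mode 2: lossless fix (nlink, isize, xattr_cnt, ...) */
	ACT_FAIL,       /* give up and return an error code */
	ACT_FIX_REPORT, /* mode 3(a): fix, print dropped files / dropped data */
	ACT_REBUILD     /* mode 3(b): scan all nodes, ignore the TNC */
};

struct problem {
	bool loses_data;   /* fixing requires dropping nodes */
	bool tnc_valid;    /* the index can still be trusted */
	bool sb_corrupted; /* superblock is damaged */
};

enum fsck_action decide(enum fsck_mode mode, const struct problem *p)
{
	/* Assumption: nothing can be repaired without a superblock. */
	if (p->sb_corrupted)
		return ACT_FAIL;

	switch (mode) {
	case MODE_COMMON:
		return ACT_ASK;
	case MODE_SAFE:
		/* Assumption: an invalid TNC counts as a lossy problem. */
		if (p->loses_data || !p->tnc_valid)
			return ACT_FAIL;
		return ACT_FIX;
	case MODE_DANGEROUS:
		return p->tnc_valid ? ACT_FIX_REPORT : ACT_REBUILD;
	}
	return ACT_FAIL;
}
```

The point of the sketch is that only mode 3(b) ever ignores the TNC; every
other path either leaves the image alone or fixes it with the index intact.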
Q1: What do you think of this method?
Q2: Modes 1, 2 and 3(a) depend on journal replay. I found that
xfs_repair[1] should be used after mounting/unmounting xfs (to let the
kernel replay the journal). If UBIFS does the same, there is no need to copy
the recovery subsystem into userspace, but the user has to mount/unmount
before running fsck. I found that e2fsck has copied the recovery code into
userspace, so it can run without mounting/unmounting. If UBIFS does that,
journal replay will update the TNC and LPT; please refer to Q3(1). Which
method do you suggest?
Q3: If fsck drops or updates a node (e.g. a dentry whose inode is lost, a
corrected inode) in modes 1, 2 and 3(a), the TNC/LPT should be updated.
There are two ways to update the TNC and LPT:
1) Like the kernel does, which means marking the corresponding TNC/LPT
branches dirty and calling do_commit(). This copies a lot of code into
userspace, e.g. tnc.c, lpt.c, tnc_commit.c, lpt_commit.c. I fear this
carries risks. For example, if there is no space left for new index nodes,
should gc be performed? If so, the gc/lpt gc code would have to be copied
too.
2) Rebuild a new TNC/LPT based on the valid nodes. This way is simple, but
the old good TNC could be corrupted: a power cut during fsck may leave a
UBIFS image that must be repaired in mode 3(b), even though it could have
been repaired in mode 2 or 3(a) before fsck was invoked.
Which way is better?
[1]
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_file_systems/checking-and-repairing-a-file-system__managing-file-systems#proc_repairing-an-xfs-file-system-with-xfs_repair_checking-and-repairing-a-file-system
> Long story short, I'm not opposed to the idea, I just want to make
> sure that this new tool or feature is not used blindly, since
> it cannot do magic.