Message-ID: <BANLkTikLrLPLgj3ykmqcZ_+KVQi85vrWAPSJHGX_hOV3nNL0sg@mail.gmail.com>
Date:	Mon, 20 Jun 2011 22:12:48 +0800
From:	Wang Shaoyan <stufever@...il.com>
To:	Jan Kara <jack@...e.cz>
Cc:	linux-ext4@...r.kernel.org,
	Wang Shaoyan <wangshaoyan.pt@...bao.com>,
	Ted Tso <tytso@....edu>
Subject: Re: [PATCH] ext4: Set file system to read-only by I/O error threshold

Thanks for your reply!
2011/6/20 Jan Kara <jack@...e.cz>:

>  Hum, if I understand your problem right, you should just mount the
> filesystem with errors=remount-ro and you will get the behavior you need.
> Or what is insufficient on that solution? Your patch surely provides more
> flexibility but is that really needed?
>

1. Each of our production machines has more than ten hard disks, so it
is not acceptable to panic the whole system because of a single error
on one hard disk.
2. Multiple tasks may access the same hard drive at the same time, so
it is not ideal to change the filesystem to read-only because of a
single error in one task, since the other tasks may then be killed.

That's why we want a relaxed restriction: only when the error counter
grows beyond our threshold do we change the fs to read-only or panic.
When a system has a dozen hard drives, each running several tasks at
the same time, this feature is a real need.
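
To make the idea concrete, here is a minimal userspace sketch of the
threshold logic described above. It is not the code from the patch:
the struct, the enum, and the function name are all hypothetical, and
it assumes "exceeds the threshold" means the counter is strictly
greater than the configured value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy to apply once the threshold is exceeded. */
enum err_action { ERR_CONTINUE, ERR_REMOUNT_RO, ERR_PANIC };

/* Hypothetical per-filesystem state; not the structure used by the patch. */
struct fs_err_state {
    unsigned int io_errors;    /* I/O errors seen so far on this fs */
    unsigned int threshold;    /* errors tolerated before acting */
    enum err_action on_exceed; /* action once the threshold is exceeded */
    bool read_only;            /* set when the fs has gone read-only */
};

/*
 * Record one I/O error and return the action the caller should take.
 * Errors up to and including the threshold are tolerated; the first
 * error beyond it triggers the configured action.
 */
static enum err_action fs_record_io_error(struct fs_err_state *st)
{
    assert(st != NULL);

    st->io_errors++;
    if (st->io_errors <= st->threshold)
        return ERR_CONTINUE;

    if (st->on_exceed == ERR_REMOUNT_RO)
        st->read_only = true;
    return st->on_exceed;
}
```

Because the state is kept per filesystem, one failing disk out of a
dozen only affects its own counter, and a single error from one task
does not immediately affect the other tasks using that disk.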

> BTW, in cluster environment (which Hadoop seems to be AFAIU) it is standard
> to mount filesystem even with stricter errors=panic so that node is taken
> off the grid as soon as some problem happens. Usually handling service
> failover is simpler than handling uncertain state after a filesystem error.
>

-- 
Wang Shaoyan