Date:	Mon, 14 Jul 2008 10:04:30 -0400
From:	jim owens <jowens@...com>
To:	Takashi Sato <t-sato@...jp.nec.com>
CC:	mtk.manpages@...glemail.com, axboe@...nel.dk,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com, xfs@....sgi.com,
	linux-ext4@...r.kernel.org, viro@...IV.linux.org.uk,
	akpm@...ux-foundation.org, pavel@...e.cz,
	linux-fsdevel@...r.kernel.org, hch@...radead.org,
	Miklos Szeredi <miklos@...redi.hu>,
	Arjan van de Ven <arjan@...radead.org>,
	Theodore Tso <tytso@....edu>,
	Dave Chinner <david@...morbit.com>
Subject: Re: [PATCH 3/3] Add timeout feature

Takashi Sato wrote:

> What is the difference between the timeout and AUTO-THAW?
> When the kernel detects a deadlock, does it occur to solve it?

TIMEOUT is a user-specified limit for the freeze.  It is
not a deadlock preventer or deadlock breaker.  The reason
it exists is:

    - middle of the night (low but not zero users)
    - cron triggers freeze and hardware snapshot
    - SAN is overloaded by tape copy traffic, so
      hardware will take 2 hours to ack snapshot done
    - user "company president" tries to create a report
      needed for an AM meeting with bankers
    - with so few users, system will just patiently
      wait for hardware to finish
    - after 10 minutes "company president" pages
      admin, admin's boss, and "IT vice president"
      in a real unhappy mood
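
A minimal userspace sketch of that idea, assuming a toy model rather
than the actual patch: the caller of freeze supplies a limit, and the
freeze is dropped when the limit expires even though nothing is
deadlocked. The class FreezeModel and its freeze/thaw methods are
hypothetical names for illustration, not a kernel or ioctl interface.

```python
import threading

class FreezeModel:
    """Toy model of a freezable filesystem (hypothetical, not kernel code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._timer = None
        self.frozen = False

    def freeze(self, timeout_s=None):
        """Freeze; if timeout_s is given, thaw automatically after that long."""
        with self._lock:
            self.frozen = True
            if timeout_s is not None:
                # The timeout is only a user-specified cap on the freeze
                # duration. It does not detect or break deadlocks.
                self._timer = threading.Timer(timeout_s, self.thaw)
                self._timer.daemon = True
                self._timer.start()

    def thaw(self):
        """Thaw and cancel any pending timeout; harmless if already thawed."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            self.frozen = False
```

In the scenario above, the cron job would call something like
freeze(timeout_s=600), so the report writer waits at most ten minutes
no matter how long the hardware takes to ack the snapshot.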

AUTO-THAW is simply a name for the effect of any deadlock
preventer or deadlock breaker code in the kernel's freeze
implementation paths, when that code unfreezes the
filesystem.  We also implemented deadlock preventer code
that does not thaw the freeze.

None of the AUTO-THAW code is there to stop a stupid
userspace program that calls freeze.  It handles things
like "a system in our cluster is going down so we
must have this filesystem unfrozen or the whole
cluster will crash".  In places where there could be
a kernel deadlock, we made it "lock-only-if-non-blocking",
and if we could not wait to retry later, the failure
to lock would trigger an immediate unfreeze.
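
The "lock-only-if-non-blocking" pattern can be sketched in userspace
roughly as follows. This is an illustration under assumptions, not the
implementation described above: TrylockFs, enter_critical_path and the
can_retry_later parameter are hypothetical names, and the model is
single-threaded for clarity.

```python
import threading

class TrylockFs:
    """Toy filesystem whose lock is held for the duration of a freeze."""

    def __init__(self):
        self.lock = threading.Lock()
        self.frozen = False

    def freeze(self):
        self.lock.acquire()
        self.frozen = True

    def thaw(self):
        # Not race-free; a real implementation needs proper synchronization.
        if self.frozen:
            self.frozen = False
            self.lock.release()

def enter_critical_path(fs, can_retry_later):
    """Never block on the freeze lock; report what happened instead."""
    if fs.lock.acquire(blocking=False):
        fs.lock.release()
        return "proceeded"      # not frozen, no deadlock risk
    if can_retry_later:
        return "deferred"       # the freeze stays in place
    fs.thaw()                   # cannot wait: immediate unfreeze
    return "thawed"
```

The point of the design is that a path which might deadlock never
sleeps on the lock; it either comes back later or breaks the freeze.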

Deadlock prevention needs code in critical paths in more
than just filesystems.  Sometimes this is as simple as
an "I can't wait on freeze" flag added to a VM-filesystem
interface.
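
As a rough sketch of such a flag, assuming a made-up interface:
write_page, nowait_on_freeze and WouldBlockOnFreeze below are
hypothetical and do not correspond to any real VM or filesystem call.
A caller that must not sleep passes the flag and gets an error back
instead of blocking until the thaw.

```python
import threading

class WouldBlockOnFreeze(Exception):
    """Raised when the caller cannot wait and the filesystem is frozen."""

class FlagFs:
    def __init__(self):
        self.thawed = threading.Event()
        self.thawed.set()           # starts out unfrozen

    def freeze(self):
        self.thawed.clear()

    def thaw(self):
        self.thawed.set()

def write_page(fs, data, nowait_on_freeze=False):
    """Write data; with nowait_on_freeze, fail fast instead of sleeping."""
    if not fs.thawed.is_set():
        if nowait_on_freeze:
            raise WouldBlockOnFreeze("filesystem is frozen")
        fs.thawed.wait()            # ordinary callers sleep until thaw
    return len(data)                # stand-in for the actual write
```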

Timers just don't work for keeping the kernel alive
because they don't trigger on resource exhaustion.

jim
