lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wiJ0xo2_aqVCoJHnO_AYP=cy1E8Pk5Vxb13+nFastAFEQ@mail.gmail.com>
Date:   Thu, 10 Aug 2023 22:53:27 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Kent Overstreet <kent.overstreet@...ux.dev>
Cc:     "Darrick J. Wong" <djwong@...nel.org>,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-bcachefs@...r.kernel.org, dchinner@...hat.com,
        sandeen@...hat.com, willy@...radead.org, josef@...icpanda.com,
        tytso@....edu, bfoster@...hat.com, jack@...e.cz,
        andreas.gruenbacher@...il.com, brauner@...nel.org,
        peterz@...radead.org, akpm@...ux-foundation.org,
        dhowells@...hat.com, snitzer@...nel.org, axboe@...nel.dk
Subject: Re: [GIT PULL] bcachefs

On Thu, 10 Aug 2023 at 22:29, Kent Overstreet <kent.overstreet@...ux.dev> wrote:
>
> On Thu, Aug 10, 2023 at 10:20:22PM -0700, Linus Torvalds wrote:
> > If it's purely "umount doesnt' succeed because the filesystem is still
> > busy with cleanups", then things are much better.
>
> That's exactly it. We have various tests that kill -9 fio and then
> umount, and umount spuriously fails.

Well, it sounds like Jens already has some handle on at least one
io_uring shutdown case that didn't wait for completion.

At the same time, a random -EBUSY is kind of an expected failure in
real life, since outside of strictly controlled environments you could
easily have just some entirely unrelated thing that just happens to
have looked at the filesystem when you tried to unmount it.

So any real-life use tends to use umount in a (limited) loop. It might
just make sense for the fsstress test scripts to do the same
regardless.

There's no actual good reason to think that -EBUSY is a hard error. It
very much can be transient.

In fact, I have this horrible flash-back memory to some auto-expiry
scripts that used to do the equivalent of "umount -a -t autofs" every
minute or so as a horrible model for expiring things, happy and secure
in the knowledge that if the filesystem was still in active use, it
would just fail.

So may I suggest that even if the immediate issue ends up being sorted
out, just from a robustness standpoint the "consider EBUSY a hard
error" seems to be a mistake.

Transient failures are pretty much expected, and not all of them are
necessarily kernel-related (ie think "concurrent updatedb run" or any
number of other possibilities).

          Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ