Message-ID: <CACT4Y+ZtrBeZyNeSJ_9d3DdVuP21=h7TNnOZJ_wLhLu11+qAAA@mail.gmail.com>
Date:   Wed, 16 Jan 2019 13:37:22 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Jan Kara <jack@...e.cz>
Cc:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Al Viro <viro@...iv.linux.org.uk>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>,
        Alan Stern <stern@...land.harvard.edu>,
        Andrea Parri <andrea.parri@...rulasolutions.com>
Subject: Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.

On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@...e.cz> wrote:
>
> On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@...e.cz> wrote:
> > >
> > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > <penguin-kernel@...ove.sakura.ne.jp> wrote:
> > > > >
> > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > >> How did you arrive at the conclusion that it is harmless?
> > > > > >> There is only one relevant standard covering this, the C language
> > > > > >> standard, and it is very clear on this -- this is Undefined
> > > > > >> Behavior, which is the same as, for example, reading/writing random
> > > > > >> pointers.
> > > > > >>
> > > > > >> Check out this on how any race that you might think is benign can be
> > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > >
> > > > > > Also, there is no other practical definition of data race for automatic
> > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > such code we don't get data race detection, and we don't detect the
> > > > > > easy way thousands of races in kernel code that one may consider more
> > > > > > harmful than this one. Instead we will spend large amounts of time
> > > > > > fixing some of them the hard way, and leave the rest as just too hard
> > > > > > to debug, so the kernel keeps crashing from time to time (I believe a
> > > > > > portion of the currently open syzbot bugs that developers just left as
> > > > > > "I don't see how this can happen" are due to such races).
> > > > > >
> > > > >
> > > > > I still don't follow. A read/write of sizeof(long) bytes at a naturally
> > > > > aligned address is atomic, isn't it?
> > > >
> > > > Nobody guarantees this. According to C, non-atomic conflicting
> > > > reads/writes of sizeof(long) bytes cause undefined behavior of the
> > > > whole program.
> > >
> > > Yes, but to be fair the kernel has always relied pretty heavily on long
> > > accesses being atomic, so that is now the de-facto standard for the kernel
> > > AFAICT. I understand this makes life hard for static checkers, but such is
> > > reality.
> >
> > Yes, but nobody ever defined what "a long access" means. And if you
> > see a function that accepts a long argument and stores it into a long
> > field, no, it does not qualify. I bet this will come as a surprise to
> > lots of developers.
>
> Yes, inlining and other optimizations can screw you.
>
> > Check out this fix and try to extrapolate how this "a function stores a
> > long into a long field, and that leads to a serious security bug" can
> > actually apply to a whole lot of places after inlining (or when somebody
> > just slightly shuffles code in a way that looks totally safe) that also
> > kinda look safe and atomic:
> > https://lore.kernel.org/patchwork/patch/599779/
> > So where is the boundary between "a long access" that is atomic and
> > the one that is not necessarily atomic?
>
> So I tend to rely on "long access being atomic" for opaque values (no
> flags, no counters, ...). Just a value that gets fetched from some global
> variable / other data structure, stored, read, and possibly compared for
> equality. I agree the compiler could still screw you if it could infer how
> that value was initially created and tried to be clever about it...

So can you, or somebody else, define a set of rules that we can use to
decide each particular case? How do we avoid the "the compiler could
still screw you" part?

Inlining is always enabled, so one needs to take into account
everything that can possibly be inlined, now or in the future. And
also link-time code generation: if we don't use it, we are dropping
10% of performance on the floor.
Also, ensuring that the code works when it is first submitted is the
smaller part of the problem. Ensuring that it continues to work in the
future is what matters more. Try to imagine the amount of burden this
puts onto every developer who touches any kernel code in the future.
Basically, if you slightly alter local logic in a function that does
not do any loads/stores, you can invalidate multiple "proofs" that
long accesses are atomic. Or you just move a function from a .c file
to a .h. I bet nobody re-proves all the "long accesses are atomic"
arguments around the changed code during code review, so these things
break over time.
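
To make this concrete, here is a sketch of what I mean (hypothetical
names, and only an illustration of what the C rules permit, not a
claim about what any particular compiler does today):

  /* A setter that "obviously" stores a long. */
  struct foo {
          long state;
  };

  static void foo_set_state(struct foo *f, long v)
  {
          f->state = v;   /* plain store: C gives no atomicity guarantee */
  }

Once this is inlined into a caller that computes v piecewise, the
compiler is free to emit the store piecewise as well, or to use
f->state as scratch space, because a plain access tells it nothing
about concurrent readers. WRITE_ONCE(f->state, v) is what actually
rules such transformations out.
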
Or, even if only comparisons are involved (which you mentioned as
"safe"), I can see how that can affect the compilation process. Say we
are in the branch where two variables compared equal; since no
concurrency is involved from the compiler's point of view, it can,
say, discard one variable and then re-load it from the other
variable's location, and now that variable holds a value it must never
have. I don't have a full scenario, but that's exactly the point: one
will never see all the possibilities.
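
A rough sketch of that scenario (hypothetical names; whether a given
compiler does this today is beside the point, the point is that it is
allowed to):

  extern long shared_a, shared_b;   /* written concurrently elsewhere */
  void consume(long v);

  void example(void)
  {
          long a = shared_a;        /* plain, unmarked loads */
          long b = shared_b;

          if (a == b) {
                  /*
                   * From the compiler's single-threaded point of view a
                   * and b are interchangeable here, so it may discard
                   * its copy of 'a' and re-read it from shared_b (or
                   * from shared_a).  If another CPU changed that
                   * location in the meantime, the branch now runs with
                   * a value the a == b check was supposed to exclude.
                   */
                  consume(a);
          }
  }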

It all becomes a very slippery slope very quickly. And we do want the
compiler to generate code that is as fast as possible and to do all
these optimizations. And there are no big objective reasons not to
simply mark all concurrent accesses and stop spending large amounts of
time on these "proofs".
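
And marking them is not much code. A minimal sketch, assuming the
usual READ_ONCE()/WRITE_ONCE() helpers from <linux/compiler.h>:

  /*
   * Unmarked: invisible to race detectors; the compiler may tear,
   * merge or re-load these accesses.
   */
  shared = val;
  tmp = shared;

  /* Marked: documents the concurrency and constrains the compiler. */
  WRITE_ONCE(shared, val);
  tmp = READ_ONCE(shared);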
