Message-ID: <CA+55aFw-Fed2QJCYBosGby71AsXAVKGq8mG-HYR13rEnD9V2Lg@mail.gmail.com>
Date:   Tue, 14 Aug 2018 16:16:41 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     keith.busch@...ux.intel.com
Cc:     wnukowski@...gle.com, Jens Axboe <axboe@...com>,
        Sagi Grimberg <sagi@...mberg.me>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-nvme <linux-nvme@...ts.infradead.org>,
        Keith Busch <keith.busch@...el.com>, yigitfiliz@...gle.com,
        Christoph Hellwig <hch@....de>
Subject: Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

Guys, you're both wrong.

On Tue, Aug 14, 2018 at 03:17:35PM -0700, Michal Wnukowski wrote:
>
> With memory barrier in place, the volatile keyword around *dbbuf_ei is
> redundant.

No. The memory barrier enforces _ordering_, but it doesn't enforce
that the accesses are only done once. So when you do

>              *dbbuf_db = value;

to write to dbbuf_db, and

>    *dbbuf_ei

to read from dbbuf_ei, without the volatile the write (or the read)
could be done multiple times, which can cause serious confusion.

So the "mb()" enforces ordering, and the volatile means that the
accesses will each be done as one single access.

Two different issues entirely.

However, there's a more serious problem with your patch:

> +             /*
> +              * Ensure that the doorbell is updated before reading
> +              * the EventIdx from memory
> +              */
> +             mb();

Good comment. Except what about the other side?

When you use memory ordering rules, as opposed to locking, there's
always *two* sides to any access order. There's this "write dbbuf_db"
vs "read dbbuf_ei" ordering.

But there's the other side: what about the side that writes dbbuf_ei,
and reads dbbuf_db?

I'm assuming that's the actual controller hardware, but it needs a
comment about *that* access being ordered too, because if it isn't,
then ordering this side is pointless.
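
A minimal userspace sketch of that two-sided pairing, using C11 atomics
rather than the kernel primitives. The struct and both function names are
hypothetical, and the "device" side is ordinary code standing in for the
controller, whose real behaviour is not described here. Each side stores its
own word, issues a full fence, then loads the other side's word. When both
sides do this, at least one of them is guaranteed to observe the other's
store. Drop the fence on either side and that guarantee is gone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shadow doorbell / EventIdx pair. */
struct shadow {
    _Atomic uint32_t db;   /* written by the host, read by the device */
    _Atomic uint32_t ei;   /* written by the device, read by the host */
};

/* Host side: publish the doorbell value, then read the event index. */
static uint32_t host_publish_then_check(struct shadow *s, uint32_t value)
{
    assert(s != NULL);
    atomic_store_explicit(&s->db, value, memory_order_relaxed);
    /* Full fence: the store above is ordered before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&s->ei, memory_order_relaxed);
}

/* Device side (assumed): publish the event index, then read the doorbell. */
static uint32_t device_publish_then_check(struct shadow *s, uint32_t event_idx)
{
    assert(s != NULL);
    atomic_store_explicit(&s->ei, event_idx, memory_order_relaxed);
    /* The matching fence; without it the host-side fence buys nothing. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&s->db, memory_order_relaxed);
}
```

Calling the two functions one after another from a single thread only shows
the data flow. Exercising the actual race needs the two sides running
concurrently.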

On Tue, Aug 14, 2018 at 3:56 PM Keith Busch <keith.busch@...ux.intel.com> wrote:
>
> You just want to ensure the '*dbbuf_db = value' isn't reordered, right?
> The order dependency might be more obvious if done as:
>
>         WRITE_ONCE(*dbbuf_db, value);
>
>         if (!nvme_dbbuf_need_event(READ_ONCE(*dbbuf_ei), value, old_value))
>                 return false;
>
> And 'volatile' is again redundant.

Yes, using READ_ONCE/WRITE_ONCE obviates the need for volatile, but it
does *not* impose a memory ordering.

It imposes an ordering on the compiler, but not on the CPU, so you
still want the "mb()" there (or the accesses need to be to uncached
memory or something, but then you should be using "readl()/writel()",
so that's not the case here).
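
Putting the two pieces together, a hedged userspace sketch of the host side
might look like the following. The READ_ONCE/WRITE_ONCE/mb() macros below are
simplified stand-ins built on GCC/Clang extensions, not the kernel's
definitions. need_event() follows the wrap-around comparison I would expect
nvme_dbbuf_need_event() to perform, which is an assumption since its body is
not shown in this thread. update_dbbuf() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions): single access via a volatile cast. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))
/* Stand-in full barrier: orders the CPU as well as the compiler. */
#define mb()               __atomic_thread_fence(__ATOMIC_SEQ_CST)

/* Assumed check: true if event_idx lies in the range [old, new_idx). */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

static bool update_dbbuf(uint32_t *dbbuf_db, uint32_t *dbbuf_ei, uint16_t value)
{
    assert(dbbuf_db != NULL && dbbuf_ei != NULL);

    uint16_t old_value = (uint16_t)READ_ONCE(*dbbuf_db);

    /* One store, not torn or repeated by the compiler. */
    WRITE_ONCE(*dbbuf_db, value);

    /* Order the doorbell store before the EventIdx load on the CPU too. */
    mb();

    /* One load of EventIdx, after the barrier. */
    return need_event((uint16_t)READ_ONCE(*dbbuf_ei), value, old_value);
}
```

With the doorbell at 5 and the tail moved to 8, an EventIdx of 5 asks for a
notification, while an EventIdx of 9 (not yet reached) does not.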

                 Linus
