linux-kernel - Re: [PATCH v2 2/4] cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations

Message-ID: <20210330220940.GB2356281@nvidia.com>
Date:   Tue, 30 Mar 2021 19:09:40 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     linux-cxl@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Vishal L Verma <vishal.l.verma@...el.com>,
        "Weiny, Ira" <ira.weiny@...el.com>,
        "Schofield, Alison" <alison.schofield@...el.com>
Subject: Re: [PATCH v2 2/4] cxl/mem: Fix synchronization mechanism for device
 removal vs ioctl operations

On Tue, Mar 30, 2021 at 02:00:43PM -0700, Dan Williams wrote:

> Ok, so another case where I agree the instability exists but does not
> matter in practice in this case because the cxl_memdev_ioctl() read
> side is prepared for the state change to leak into the down_read()
> section. There's no meaningful publish/unpublish race that READ_ONCE()
> resolves here. 

Generally READ_ONCE is about protecting the control flow from the
*compiler*.

Say, I have 

    if (a) {
      do_x();
      do_y();
    }

The compiler *might* transform it so I get:

    if (a)
       do_x();
    if (a)
       do_y();

Now if 'a' is an unstable data race I have a crazy problem: the basic
invariant that x and y are always done together is broken and I can
have x done without y (or y done without x if the CPU runs it out of
order!).

But I can't see this at all from the code, it just runs wrong under
certain races with certain compilers.
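
A minimal single-threaded sketch of that failure, for illustration only.
It hand-writes the double-load form the compiler is allowed to produce,
and it assumes a hypothetical do_x() that also clears 'a', standing in
for a concurrent writer that lands between the two loads. None of these
names come from the cxl code.

```c
static int a;              /* the racy flag */
static int x_done, y_done; /* how often each step ran */

/* Hypothetical: clearing 'a' here simulates another thread writing it
 * after the first load and before the second one. */
static void do_x(void) { x_done++; a = 0; }
static void do_y(void) { y_done++; }

/* Hand-written version of the transformed code: 'a' is loaded twice,
 * so the two branches can disagree with each other. */
void transformed(void)
{
    if (a)
        do_x();
    if (a)
        do_y();
}
```

Calling transformed() with a == 1 runs do_x() but never do_y(), even
though the source as written has them inside the same if block.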

READ_ONCE prevents this kind of compiler transform; this is what it
is for. It is why in modern times it has become expected to always
mark unlocked unstable data accesses with these helpers, or the RCU
specific variants.

It is not reasoning about happens before/happens after kind of ideas,
it is about "the compiler assumes the memory doesn't change and if we
break that assumption the compiled result might not work sanely".
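
As a rough userspace sketch of the marked access: the READ_ONCE below is
a simplified stand-in for the kernel macro (a volatile cast using the GNU
__typeof__ extension), not the kernel's actual definition. The function
names, and the do_x() that clears 'a' to mimic a concurrent writer, are
assumptions made for illustration.

```c
#include <assert.h>

/* Simplified stand-in: the volatile access makes the compiler perform
 * exactly one load each time the macro is evaluated. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

static int a;              /* the racy flag */
static int x_done, y_done; /* how often each step ran */

/* Hypothetical: clearing 'a' mimics a writer racing with the reader. */
static void do_x(void) { x_done++; a = 0; }
static void do_y(void) { y_done++; }

/* The x/y pairing invariant from the text. */
static void check_paired(void) { assert(x_done == y_done); }

void run_both_or_neither(void)
{
    /* One marked load feeds one branch: the compiler may not split
     * this into two independent tests of 'a'. */
    if (READ_ONCE(a)) {
        do_x();
        do_y();
    }
    check_paired();
}
```

With a == 1 both steps run, even though 'a' changes in the middle; with
a == 0 neither runs. Either way the pairing holds.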

> down_write(&cxl_memdev_rwsem);
> cxlmd->cxlm = NULL;
> up_write(&cxl_memdev_rwsem);
> cdev_device_del(...);
 
> ...and then no unstable state of the ->cxlm reference leaks into the
> read-side that does:
> 
> down_read(&cxl_memdev_rwsem);
> if (!cxlmd->cxlm) {
>     up_read(&cxl_memdev_rwsem);
>     return -ENXIO;
> }
> ...
> up_read(&cxl_memdev_rwsem);

Sure, this construction is easy to understand and validate, if a rwsem
is OK performance-wise. We can do it after the cdev_device_del. Here
the cxlm is the protected data and it is always accessed under the
lock.

The rwsem can be transformed to SRCU by marking cxlm with the __rcu
type annotation and using srcu_dereference (as a READ_ONCE wrapper)
and rcu_assign_pointer (as a WRITE_ONCE wrapper) to manipulate the cxlm.
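
The shape of that conversion can be modeled in userspace. The sketch
below is a toy: every toy_* name is hypothetical, C11 atomics stand in
for srcu_dereference and rcu_assign_pointer, and a single reader counter
stands in for SRCU's grace-period machinery (it is correct with
sequentially consistent atomics, but it can starve the writer and is
nothing like the real implementation). In kernel code the corresponding
calls would be srcu_read_lock/srcu_read_unlock, srcu_dereference,
rcu_assign_pointer and synchronize_srcu.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

struct cxl_mem { int payload; };           /* hypothetical stand-in */

static _Atomic(struct cxl_mem *) cxlm_ptr; /* plays the __rcu pointer */
static atomic_int active_readers;          /* toy grace-period state */

/* srcu_read_lock()/srcu_read_unlock() analogues */
static void toy_read_lock(void)   { atomic_fetch_add(&active_readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&active_readers, 1); }

/* srcu_dereference() analogue: a single marked load of the pointer */
static struct cxl_mem *toy_dereference(void) { return atomic_load(&cxlm_ptr); }

/* rcu_assign_pointer() analogue: a single marked store */
void toy_publish(struct cxl_mem *p) { atomic_store(&cxlm_ptr, p); }

/* synchronize_srcu() analogue: wait until no reader is in a section */
static void toy_synchronize(void)
{
    while (atomic_load(&active_readers) != 0)
        ;
}

/* Read side, shaped like the ioctl path in the quoted code */
int toy_ioctl(int *out)
{
    int rc = -ENXIO;

    toy_read_lock();
    struct cxl_mem *cxlm = toy_dereference(); /* load once, use the copy */
    if (cxlm) {
        *out = cxlm->payload;
        rc = 0;
    }
    toy_read_unlock();
    return rc;
}

/* Removal side: unpublish, then wait out readers that saw the old value */
void toy_shutdown(void)
{
    toy_publish(NULL);
    toy_synchronize();
    /* from here on no reader can still hold the old pointer */
}
```

The read side only ever works on its local copy of the pointer, so a
concurrent toy_shutdown() cannot change the value between the NULL check
and the use.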

Jason