linux-kernel - Re: [PATCH 1/3] iio: light: Add driver for ap3216c

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGngYiVGjPSnVfnJhR+0YJg+qOzMQfJM+Cd9ZRe===9pk1_3pA@mail.gmail.com>
Date:   Tue, 12 Feb 2019 23:40:13 -0500
From:   Sven Van Asbroeck <thesven73@...il.com>
To:     Jonathan Cameron <jic23@...nel.org>
Cc:     Robert Eshleman <bobbyeshleman@...il.com>,
        Hartmut Knaack <knaack.h@....de>,
        Lars-Peter Clausen <lars@...afoo.de>,
        Peter Meerwald-Stadler <pmeerw@...erw.net>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-iio@...r.kernel.org
Subject: Re: [PATCH 1/3] iio: light: Add driver for ap3216c

On Tue, Feb 12, 2019 at 3:47 PM Jonathan Cameron <jic23@...nel.org> wrote:
>
>
> Good question on whether it is guaranteed to read in increasing register
> order (I didn't actually check the addresses are in increasing order
> but assume they are or you would have pointed that out ;)
>
> That strikes me as behaviour that should probably be documented as long
> as it is true currently.
>

Looking at the datasheet closely... this chip doesn't seem to support
bulk/raw accesses. At least, it's not documented. So maybe we should
steer clear of that, and tell the regmap core, via .use_single_read
and .use_single_write in regmap_config.

So once these flags are set, we could call regmap_bulk_read() on the
LO/HI combo registers, and it would probably work. But do we really
want to? The LO register, when read, cause side-effects
in the corresponding HI register. That's weird / unexpected for developers.
Maybe it makes sense to make this explicit, not implicit. In addition,
bulk_read() behaviour changes in the future may break the register
reads ?

> >
>
> If we clear just the right one, (which I think we can do from
> the datasheet
> "1: Software clear after writing 1 into address 0x01 each bit#"
> However the code isn't writing a 3 in that clear, so I'm not
> sure if the datasheet is correct or not...
>
> and it is a level interrupt (which I think it is?) then we would
> be safe against this miss.
>
> If either we can only globally clear or it's not a level interrupt
> there isn't much we can do to avoid a miss, it's just a bad hardware
> design.
>

I think we're in luck, and this would work ! Note 1 on page 12 of the
datasheet:

"Note1. The INT pin will be set low and set INT status bit when ALS or
PS or (ALS+PS) interrupt event occurrence.
User can clear INT bit and individual status bits when reading the
register 0xD(ALS) , 0xF(PS) and 0xD+0xF(ALS+PS) respectively."

>
>

I'm sorry to wear out your patience, but I think there are more
synchronization issues lurking here.

The iio_push_event(THRESH) tells interested userspace processes:
"hey, maybe you want to check if a threshold is crossed", right?
Is my understanding correct?

If so, I think it's possible that a threshold is crossed, without
userspace knowing. To see why, let's look at the handler again:

static irqreturn_t ap3216c_event_handler(int irq, void *p)
{
        /* imagine ALS interrupt came in */
        regmap_read(data->regmap, AP3216C_INT_STATUS, &status);
        if (status & als) iio_push_event(LIGHT);

        /* imagine schedule happens here */
        msleep(...);
        /* while we are not running, userspace reacts to the event,
         * reads the new ALS value.
         * Next, imagine a _new_ ALS interrupt comes in, while we are
         * still sleeping. the irq does not get re-scheduled, as it's
         * still running !
         * Next, we proceed:
        */
        ap3216c_clear_als(data);
        /* at this point there has been a new ALS interrupt without
         * userspace knowing about it.
         */
}

The _chance_ of this happening is very low, as ALS/PS events are quite
widely spaced. But we're not an RTOS. Question is: do we want to take
the risk? Idk, perhaps it's ok to trade simplicity for the low chance of
missing a threshold.

+static int ap3216c_write_event_config(struct iio_dev *indio_dev,
+                                    const struct iio_chan_spec *chan,
+                                    enum iio_event_type type,
+                                    enum iio_event_direction dir, int state)
+{
+       struct ap3216c_data *data = iio_priv(indio_dev);
+
+       switch (chan->type) {
+       case IIO_LIGHT:
+               data->als_thresh_en = state;
+               return 0;
+
+       case IIO_PROXIMITY:
+               data->prox_thresh_en = state;
+               return 0;
+
+       default:
+               return -EINVAL;
+       }

I think this may not work as intended. One thread (userspace) writes
a variable, another thread (threaded irq handler) checks it. but there
is no explicit or implicit memory barrier. So when userspace activates
thresholding, it may take a long time for the handler to 'see' it !

One way to guarantee that the irq handler 'sees' this immediately
is to grab a mutex, which issues an implicit memory barrier.

>
>

Allow me to suggest a simple, straightforward way to make all the above
issues go away (if indeed they are issues) :

First, disable fine-grained regmap locking, by setting .disable_locking
in regmap_config.

Next, read ALS and PS _exclusively_ in the irq handler, guard it with
a mutex:

static irqreturn_t ap3216c_event_handler(int irq, void *p)
{
    int ret = IRQ_NONE;

    mutex_lock(&data->m);
    regmap_read(data->regmap, AP3216C_INT_STATUS, &status);
    if (status & als) {
        /* mutex m ensures LO and HI are read non-interleaved */
        regmap_read(data->regmap, ALS_LO, &als_lo);
        regmap_read(data->regmap, ALS_HI, &als_hi);
        /* mutex m's memory barrier lets userspace 'see' data->als change */
        data->als = als_lo | (als_hi<<8);
        /* mutex m's memory barrier lets us 'see' do_thresholding change */
        if (data->do_thresholding)
            iio_push_event();
        /* mutex m prevents userspace from reading the cached value
         * in between iio_push_event() and als_clear(), which means
         * userspace never 'misses' interrupts.
         */
        als_clear(data);
        ret = IRQ_HANDLED;
    }
    ( ... ps case left out for brevity ... )
    mutex_unlock(&data->m);

    return ret;
}

Now, ap3216c_read_raw becomes:

ap3216c_read_raw()
{
    mutex_lock(&data->m);
    switch (mask) {
        IIO_CHAN_INFO_PROCESSED:
            switch (chan->type) {
                case IIO_LIGHT:
                    *val = some_function_of(data->als);
            }
    }
    mutex_unlock(&data->m);
}

and ap3216c_write_event_config becomes:

ap3216c_write_event_config()
{
    mutex_lock(&data->m);
    switch (chan->type) {
       case IIO_LIGHT:
               data->als_thresh_en = state;
               goto out;
    }
out:
    mutex_unlock(&data->m);
}

In fact, we'd need mutex_lock around any function that touches the
regmap, cached data, or threshold enable/disable flags.