[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c5d5575e-7df3-4e2e-9c00-d9b71959bd02@axis.com>
Date: Wed, 18 Dec 2024 11:46:20 +0100
From: Per-Daniel Olsson <perdaniel.olsson@...s.com>
To: Jonathan Cameron <jic23@...nel.org>
Cc: Lars-Peter Clausen <lars@...afoo.de>, Rob Herring <robh@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley
<conor+dt@...nel.org>, Javier Carrasco <javier.carrasco.cruz@...il.com>,
linux-iio@...r.kernel.org, linux-kernel@...r.kernel.org,
devicetree@...r.kernel.org, rickard.andersson@...s.com, kernel@...s.com
Subject: Re: [PATCH v9 2/2] iio: light: Add support for TI OPT4060 color
sensor
On 12/14/24 19:21, Jonathan Cameron wrote:
> On Wed, 11 Dec 2024 15:04:09 +0100
> Per-Daniel Olsson <perdaniel.olsson@...s.com> wrote:
>
>> Add support for Texas Instruments OPT4060 RGBW Color sensor.
>>
>> Signed-off-by: Per-Daniel Olsson <perdaniel.olsson@...s.com>
>
> Hi Per-Daniel,
>
> I think this is nearly there, but still some races or at least places
> I can't currently convince myself aren't races.
>
> I remember long ago being a wimp and failing to implement a similar dance
> in the max1363 ADC driver. That manages one more complexity in that in
> it's continuous mode if events are enabled, the data fields move.
> It still only supports one of events, sysfs read back or buffered output,
> not any combination. Maybe if I can find the hardware I'll revisit that
> one day.
>
> Thanks,
>
> Jonathan
Hi Jonathan,
Thank you for your comments. Also thank you for sharing your story about max1363,
sounds like a tricky piece of silicon. Great to hear that you think it can be
worth the time to implement concurrent use cases, I was a little worried that I
was pushing that part a little too far... :)
I have added comments below.
Thanks (and happy holidays if I don't hear back from you before),
Per-Daniel
>
>> diff --git a/drivers/iio/light/opt4060.c b/drivers/iio/light/opt4060.c
>> new file mode 100644
>> index 000000000000..4a7d970c5d7c
>> --- /dev/null
>> +++ b/drivers/iio/light/opt4060.c
>
>
>> +struct opt4060_chip {
>> + struct regmap *regmap;
>> + struct device *dev;
>> + struct iio_trigger *trig;
>> + u8 int_time;
>> + int irq;
>> + struct mutex irq_setting_lock;
>
> General rule is all locks need a comment on what data they protect
> even when they have a nice specific name. Bit advantage is it makes
> it clear what they are not designed to protect!
Done in v10.
>
>> + struct completion completion;
>> + bool thresh_event_lo_active;
>> + bool thresh_event_hi_active;
>> +};
>
>> +static void opt4060_claim_irq_setting_lock(struct opt4060_chip *chip)
>> +{
>> + if (chip->irq)
> I'm struggling to see why you'd be messing with irqs if you don't have
> any? I can't see a path in which chip->irq isn't set (which makes sense!)
> and you get here.
Good point, fixed in v10.
>
> So I think you can just move the mutex inline which avoids the mess
> of lockdep warnings etc and let's you use guard() to simplify things.
>
>> + mutex_lock(&chip->irq_setting_lock);
>> +}
>> +
>> +static void opt4060_release_irq_setting_lock(struct opt4060_chip *chip)
>> +{
>> + if (chip->irq)
>> + mutex_unlock(&chip->irq_setting_lock);
>> +}
>> +
>> +static int opt4060_set_int_state(struct opt4060_chip *chip, u32 state)
>> +{
>> + int ret;
>> + unsigned int regval;
>> +
>> + opt4060_claim_irq_setting_lock(chip);
>> + regval = FIELD_PREP(OPT4060_INT_CTRL_INT_CFG, state);
>> + ret = regmap_update_bits(chip->regmap, OPT4060_INT_CTRL,
>> + OPT4060_INT_CTRL_INT_CFG, regval);
>> + if (ret)
>> + dev_err(chip->dev, "Failed to set interrupt config\n");
>> + opt4060_release_irq_setting_lock(chip);
>> + return ret;
>> +}
>> +
>> +static int opt4060_set_continuous_mode(struct opt4060_chip *chip,
>> + bool continuous)
>> +{
>> + unsigned int reg;
>> + int ret;
>> +
>> + ret = regmap_read(chip->regmap, OPT4060_CTRL, ®);
>
> You could use a regmap_update_bits() call to simplify this like you
> do for the int_state above.
That register is a little special. It requires a real write sometimes
even is the data hasn't changed, so regmap_update_bits won't work. This
is needed for triggering one shot mode. I have renamed the function in
v10 since it don't just set continuous mode but also one shot. I have
also changed the comment in the code.
>
>
>> + if (ret < 0) {
>> + dev_err(chip->dev, "Failed to read ctrl register\n");
>> + return ret;
>> + }
>> + reg &= ~OPT4060_CTRL_OPER_MODE_MASK;
>> + if (continuous)
>> + reg |= FIELD_PREP(OPT4060_CTRL_OPER_MODE_MASK,
>> + OPT4060_CTRL_OPER_MODE_CONTINUOUS);
>> + else
>> + reg |= FIELD_PREP(OPT4060_CTRL_OPER_MODE_MASK,
>> + OPT4060_CTRL_OPER_MODE_ONE_SHOT);
>> +
>> + /* Trigger a new conversions by writing to CRTL register. */
>> + ret = regmap_write(chip->regmap, OPT4060_CTRL, reg);
>> + if (ret)
>> + dev_err(chip->dev, "Failed to set ctrl register\n");
>> + return ret;
>> +}
>> +
>> +static bool opt4060_event_active(struct opt4060_chip *chip)
>> +{
>> + return chip->thresh_event_lo_active || chip->thresh_event_hi_active;
>> +}
>> +
>> +static int opt4060_set_state_common(struct opt4060_chip *chip,
>> + bool continuous_sampling,
>> + bool continuous_irq, bool direct_mode)
>> +{
>> + int ret = 0;
>> +
>> + /* It is important to setup irq before sampling to avoid missing samples. */
>> + if (continuous_irq || !direct_mode)
>> + ret = opt4060_set_int_state(chip, OPT4060_INT_CTRL_ALL_CH);
>> + else if (direct_mode)
>> + ret = opt4060_set_int_state(chip, OPT4060_INT_CTRL_THRESHOLD);
>> + if (ret) {
>> + dev_err(chip->dev, "Failed to set irq state.\n");
>> + return ret;
>> + }
>> +
>> + if (continuous_sampling || !direct_mode || opt4060_event_active(chip))
>
> I think there may also a race around the event active check. You could have
> one event direction being enabled concurrently with the other being disabled.
> I'm not sure it matters but worth checking.
I think you might be correct even if I haven't been able to trigger the case in
my tests. I have added a guard for this in v10. If opt4060_write_event_config()
is called from several callers at the same time, the whole sequence all the way
down to setting sampling and irq will be synchronized with the mutex/guard.
>
> Side effect of either claiming direct or buffered mode is that only one
> caller can do it at a time, so that would close this race as well.
> Having said that, it's an implementation detail of the core (be it one that
> has been there a long time) so you should really have your own driver
> specific locking scheme prevent that.
I have implemented the "dance" in v10 and it seems to work well in my tests.
>
>
>> + ret = opt4060_set_continuous_mode(chip, true);
>> + else if (direct_mode)
>> + ret = opt4060_set_continuous_mode(chip, false);
>> + if (ret)
>> + dev_err(chip->dev, "Failed to set sampling state.\n");
>> + return ret;
>> +}
>> +
>> +/*
>> + * Function for setting the driver state for sampling and irq. When disabling
>> + * continuous sampling or irq, the IIO direct mode must be claimed to prevent
>> + * races with buffer enabling/disabling. In the case when the direct mode is
>> + * not possible to claim, the function will keep continuous mode. All
>> + * functions, sysfs read, events and buffer, work in continuous mode.
>> + */
>> +static int opt4060_set_driver_state(struct iio_dev *indio_dev,
>> + bool continuous_sampling,
>> + bool continuous_irq)
>> +{
>> + struct opt4060_chip *chip = iio_priv(indio_dev);
>> + bool direct_mode = false;
>> + int ret = 0;
>> +
>> + if (!iio_device_claim_direct_mode(indio_dev))
>> + direct_mode = true;
>
> Hmm. I'm dubious about this pattern. Why is it fine if the driver
> leaves buffered mode right here? I was expecting this to do
> the dance with claiming either direct mode or buffered mode.
> (with the retry loop). Direct mode that you pass into the next
> call may well be false when it should be true.
>
> Even if you can reason why that isn't a problem (and there are worse
> dances where it switches mode multiple times during your call
> of the next function to consider) I think it is easier to reason
> about if we know it is definitely not changing state until we
> release it.
"dance" implemented in v10.
>
>> +
>> + ret = opt4060_set_state_common(chip, continuous_sampling,
>> + continuous_irq, direct_mode);
>> +
>> + if (direct_mode)
>> + iio_device_release_direct_mode(indio_dev);
>> + return ret;
>> +}
>> +
>> +/*
>> + * This function is called in direct mode from the framework.
>> + */
>> +static int opt4060_trigger_set_state(struct iio_trigger *trig, bool state)
>> +{
>> + struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
>> + struct opt4060_chip *chip = iio_priv(indio_dev);
>> + int ret = 0;
>> +
>> + return ret = opt4060_set_state_common(chip, state, state, true);
>
> return opt_set_state_common() is probably the intent.
>
>> +}
>
>> +static int opt4060_trigger_new_samples(struct iio_dev *indio_dev)
>> +{
>> + struct opt4060_chip *chip = iio_priv(indio_dev);
>> + int ret;
>> +
>> + /*
>> + * The conversion time should be 500us startup time plus the integration time
>> + * times the number of channels. An exact timeout isn't critical, it's better
>> + * not to get incorrect errors in the log. Setting the timeout to double the
>> + * theoretical time plus and extra 100ms margin.
>> + */
>> + unsigned int timeout_us = (500 + OPT4060_NUM_CHANS *
>> + opt4060_int_time_reg[chip->int_time][0]) * 2 + 100000;
>> +
>> + /* Setting the state in one shot mode with irq on each sample. */
>> + ret = opt4060_set_driver_state(indio_dev, false, true);
>> + if (ret)
>> + return ret;
>> +
>> + if (chip->irq) {
>> + reinit_completion(&chip->completion);
>> + opt4060_claim_irq_setting_lock(chip);
>> + if (wait_for_completion_timeout(&chip->completion,
>> + usecs_to_jiffies(timeout_us)) == 0) {
>> + dev_err(chip->dev, "Completion timed out.\n");
>> + opt4060_release_irq_setting_lock(chip);
>
> This is where exposing the lock directly will simplify things as you can just use
> a guard.
Fixed with guard in v10.
>
>> + return -ETIME;
>> + }
>> + opt4060_release_irq_setting_lock(chip);
>> + } else {
>> + unsigned int ready;
>> +
>> + ret = regmap_read_poll_timeout(chip->regmap, OPT4060_RES_CTRL,
>> + ready, (ready & OPT4060_RES_CTRL_CONV_READY),
>> + 1000, timeout_us);
>> + if (ret)
>> + dev_err(chip->dev, "Conversion ready did not finish within timeout.\n");
>> + }
>> + /* Setting the state in one shot mode with irq on thresholds. */
>> + ret = opt4060_set_driver_state(indio_dev, false, false);
>> +
>> + return ret;
>
> return opt4060_...
>
>> +}
>
>> +static int opt4060_write_raw_get_fmt(struct iio_dev *indio_dev,
>> + struct iio_chan_spec const *chan,
>> + long mask)
>> +{
>> + switch (mask) {
>> + case IIO_CHAN_INFO_INT_TIME:
>> + return IIO_VAL_INT_PLUS_MICRO;
> IIRC That's the default, so you don't need to provide write_raw_get_fmt,
> though no harm in doing so I guess.
Ok, wasn't aware of that being the default.
>
>> + default:
>> + return -EINVAL;
>> + }
>> +}
>
>
>> +static int opt4060_get_channel_sel(struct opt4060_chip *chip, int *ch_sel)
>> +{
>> + int ret;
>> + u32 regval;
>> +
>> + ret = regmap_read(chip->regmap, OPT4060_INT_CTRL, ®val);
>> + if (ret)
>> + dev_err(chip->dev, "Failed to get channel selection.\n");
>
> if you have garbage, not sure it's valid to update ch_sel.
>
>> + *ch_sel = FIELD_GET(OPT4060_INT_CTRL_THRESH_SEL, regval);
>> + return ret;
>> +}
>> +
>
>
>> +static int opt4060_write_event_config(struct iio_dev *indio_dev,
>> + const struct iio_chan_spec *chan,
>> + enum iio_event_type type,
>> + enum iio_event_direction dir, bool state)
>> +{
>> + int ch_sel, ch_idx = chan->scan_index;
>> + struct opt4060_chip *chip = iio_priv(indio_dev);
>> + int ret;
>> +
>> + if (chan->type != IIO_INTENSITY)
>> + return -EINVAL;
>> + if (type != IIO_EV_TYPE_THRESH)
>> + return -EINVAL;
>> +
>> + ret = opt4060_get_channel_sel(chip, &ch_sel);
>> + if (ret)
>> + return ret;
>> +
>> + if (state) {
>> + /* Only one channel can be active at the same time */
>> + if ((chip->thresh_event_lo_active ||
>> + chip->thresh_event_hi_active) && (ch_idx != ch_sel))
>
> That's a bit nasty to ready. I'd use a slightly long line and get the || pair
> on the first line.
Fixed in v10.
>
> Hmm. We've never made rules on this but some devices to fifo type
> selection if they have limitations on events enabled at the same time.
> With hindsight I think this scheme of just saying no is probably more
> user friendly.
Ok.
>
>> + return -EBUSY;
>> + if (dir == IIO_EV_DIR_FALLING)
>> + chip->thresh_event_lo_active = true;
>> + else if (dir == IIO_EV_DIR_RISING)
>> + chip->thresh_event_hi_active = true;
>> + ret = opt4060_set_channel_sel(chip, ch_idx);
>> + if (ret)
>> + return ret;
>> + } else {
>> + if (ch_idx == ch_sel) {
>> + if (dir == IIO_EV_DIR_FALLING)
>> + chip->thresh_event_lo_active = false;
>> + else if (dir == IIO_EV_DIR_RISING)
>> + chip->thresh_event_hi_active = false;
>> + }
>> + }
>> +
>> + return opt4060_set_driver_state(indio_dev, chip->thresh_event_hi_active |
>> + chip->thresh_event_lo_active, false);
> Maybe wrap it to have the | pair on lines with nothing else. They are a little bit burried
> otherwise.
Fixed in v10.
> return opt4060_set_driver_state(indio_dev,
> chip->thresh_event_hi_active |
> chip->thresh_event_lo_active,
> false);
>
>> +}
>> +
>> +static const struct iio_info opt4060_info = {
>> + .read_raw = opt4060_read_raw,
>> + .write_raw = opt4060_write_raw,
>> + .write_raw_get_fmt = opt4060_write_raw_get_fmt,
>> + .read_avail = opt4060_read_available,
>> + .read_event_value = opt4060_read_event,
>> + .write_event_value = opt4060_write_event,
>> + .read_event_config = opt4060_read_event_config,
>> + .write_event_config = opt4060_write_event_config,
>> +};
>
> Given you have option for no irq it is probably worth picking a version of this
> info structure with all the event callbacks removed. Technically it isn't
> required but it does harden the code (by crashing horribly if you call them ;)
>
Good point, I have added a separate version without those callbacks.
>
>
>> +static irqreturn_t opt4060_trigger_handler(int irq, void *p)
>> +{
>> + struct iio_poll_func *pf = p;
>> + struct iio_dev *idev = pf->indio_dev;
>> + struct opt4060_chip *chip = iio_priv(idev);
>> + struct {
>> + u32 chan[OPT4060_NUM_CHANS];
>> + aligned_s64 ts;
>> + } raw;
>> + int i = 0;
>> + int chan, ret;
>> +
>> + /* If the trigger is coming for a different driver, a new sample is needed.*/
>
> from a different driver?
I have tried to clarify this comment in v10. When an external trigger such as
iio_sysfs_trigger is used, the sensor will not be running in continuous mode
and a new sample must be triggered.
>
>> + if (iio_trigger_validate_own_device(idev->trig, idev))
>> + opt4060_trigger_new_samples(idev);
>> +
>> + memset(&raw, 0, sizeof(raw));
>> +
>> + iio_for_each_active_channel(idev, chan) {
>> + if (chan == OPT4060_ILLUM)
>> + ret = opt4060_calc_illuminance(chip, &raw.chan[i++]);
>> + else
>> + ret = opt4060_read_raw_value(chip,
>> + idev->channels[chan].address,
>> + &raw.chan[i++]);
>> + if (ret) {
>> + dev_err(chip->dev, "Reading channel data failed\n");
>> + goto err_read;
>> + }
>> + }
>> +
>> + iio_push_to_buffers_with_timestamp(idev, &raw, pf->timestamp);
>> +err_read:
>> + iio_trigger_notify_done(idev->trig);
>> + return IRQ_HANDLED;
>> +}
>> +
>> +static irqreturn_t opt4060_irq_thread(int irq, void *private)
>> +{
>> + struct iio_dev *idev = private;
>> + struct opt4060_chip *chip = iio_priv(idev);
>> + int ret, dummy;
>> + unsigned int int_res;
>> +
>> + ret = regmap_read(chip->regmap, OPT4060_RES_CTRL, &int_res);
>> + if (ret < 0) {
>> + dev_err(chip->dev, "Failed to read interrupt reasons.\n");
>> + return IRQ_NONE;
>> + }
>> +
>> + /* Read OPT4060_CTRL to clear interrupt */
>> + ret = regmap_read(chip->regmap, OPT4060_CTRL, &dummy);
>> + if (ret < 0) {
>> + dev_err(chip->dev, "Failed to clear interrupt\n");
>> + return IRQ_NONE;
>> + }
>> +
>> + /* Handle events */
>> + if (int_res & (OPT4060_RES_CTRL_FLAG_H | OPT4060_RES_CTRL_FLAG_L)) {
>> + u64 code;
>> + int chan = 0;
>> +
>> + ret = opt4060_get_channel_sel(chip, &chan);
>> + if (ret) {
>> + dev_err(chip->dev, "Failed to read threshold channel.\n");
>> + return IRQ_NONE;
>> + }
>> +
>> + /* Check if the interrupt is from the lower threshold */
>> + if (int_res & OPT4060_RES_CTRL_FLAG_L) {
>> + code = IIO_MOD_EVENT_CODE(IIO_INTENSITY,
>> + chan,
>> + idev->channels[chan].channel2,
>> + IIO_EV_TYPE_THRESH,
>> + IIO_EV_DIR_FALLING);
>> + iio_push_event(idev, code, iio_get_time_ns(idev));
>> + }
>> + /* Check if the interrupt is from the upper threshold */
>> + if (int_res & OPT4060_RES_CTRL_FLAG_H) {
>> + code = IIO_MOD_EVENT_CODE(IIO_INTENSITY,
>> + chan,
>> + idev->channels[chan].channel2,
>> + IIO_EV_TYPE_THRESH,
>> + IIO_EV_DIR_RISING);
>> + iio_push_event(idev, code, iio_get_time_ns(idev));
>> + }
>> + }
>> +
>> + /* Handle conversion ready */
>> + if (int_res & OPT4060_RES_CTRL_CONV_READY) {
>> + /* Signal completion for potentially waiting reads */
>> + complete(&chip->completion);
>
> That looks problematic as you haven't necessarily reset the completion
> if the buffer is enabled. So you probably need a flag or something similar
> to say a sysfs read has been requested.
The completion is only used when triggering a new sample. The code will call
reinit_completion() which will set the internal counter to zero. The code will
then call wait_for_completion_timeout() which will wait for the counter to
increase. The call to complete() here will increase the counter each time it's
called but since reinit_completion() is called every time before waiting, I don't
think this is an issue.
>
>
>> +
>> + /* Handle data ready triggers */
>> + if (iio_buffer_enabled(idev))
>> + iio_trigger_poll_nested(chip->trig);
>> + }
>> + return IRQ_HANDLED;
>> +}
>
>> +static int opt4060_setup_trigger(struct opt4060_chip *chip, struct iio_dev *idev)
>> +{
>> + struct iio_trigger *data_trigger;
>> + char *name;
>> + int ret;
>> +
>> + data_trigger = devm_iio_trigger_alloc(chip->dev, "%s-data-ready-dev%d",
>> + idev->name, iio_device_id(idev));
>> + if (!data_trigger)
>> + return -ENOMEM;
>> +
>> + /* The data trigger allows for sample capture on each new conversion ready interrupt. */
>
> Make that a multiline comment.
Done in v10.
>
>> + chip->trig = data_trigger;
>> + data_trigger->ops = &opt4060_trigger_ops;
>> + iio_trigger_set_drvdata(data_trigger, idev);
>> + ret = devm_iio_trigger_register(chip->dev, data_trigger);
>> + if (ret)
>> + return dev_err_probe(chip->dev, ret,
>> + "Data ready trigger registration failed\n");
>> +
>> + name = devm_kasprintf(chip->dev, GFP_KERNEL, "%s-opt4060",
>> + dev_name(chip->dev));
>> + if (!name)
>> + return dev_err_probe(chip->dev, -ENOMEM, "Failed to alloc chip name\n");
>> +
>> + ret = devm_request_threaded_irq(chip->dev, chip->irq, NULL, opt4060_irq_thread,
>> + IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING |
> That's unusual for a trigger type interrupt and seems likely to give a lot
> of spurious interrupts. Even if the pulse is short, some interrupt controllers
> will hang on to the bonus edge and trigger again when you reenable the interrupt.
>
> If intent is to use this for events, then I think you can configure it to latched
> window mode and one edge type and it will all work.
Removed IRQF_TRIGGER_RISING in v10, don't know how both ended up there...
>
>> + IRQF_ONESHOT, name, idev);
>> + if (ret)
>> + return dev_err_probe(chip->dev, ret, "Could not request IRQ\n");
>> +
>> + init_completion(&chip->completion);
>> +
>> + mutex_init(&chip->irq_setting_lock);
>
> Trivial and I might not even bother changing it, but slightly preference for
> ret = devm_mutex_init(...)
> if (ret)
> return ret;
>
>> +
>> + ret = regmap_write_bits(chip->regmap, OPT4060_INT_CTRL,
>> + OPT4060_INT_CTRL_OUTPUT,
>> + OPT4060_INT_CTRL_OUTPUT);
>> + if (ret)
>> + return dev_err_probe(chip->dev, ret,
>> + "Failed to set interrupt as output\n");
>> +
>> + return 0;
>> +}
Powered by blists - more mailing lists