linux-kernel - Re: [PATCH v9 2/2] iio: light: Add support for TI OPT4060 color sensor

Message-ID: <c5d5575e-7df3-4e2e-9c00-d9b71959bd02@axis.com>
Date: Wed, 18 Dec 2024 11:46:20 +0100
From: Per-Daniel Olsson <perdaniel.olsson@...s.com>
To: Jonathan Cameron <jic23@...nel.org>
Cc: Lars-Peter Clausen <lars@...afoo.de>, Rob Herring <robh@...nel.org>,
 Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley
 <conor+dt@...nel.org>, Javier Carrasco <javier.carrasco.cruz@...il.com>,
 linux-iio@...r.kernel.org, linux-kernel@...r.kernel.org,
 devicetree@...r.kernel.org, rickard.andersson@...s.com, kernel@...s.com
Subject: Re: [PATCH v9 2/2] iio: light: Add support for TI OPT4060 color
 sensor

On 12/14/24 19:21, Jonathan Cameron wrote:
> On Wed, 11 Dec 2024 15:04:09 +0100
> Per-Daniel Olsson <perdaniel.olsson@...s.com> wrote:
> 
>> Add support for Texas Instruments OPT4060 RGBW Color sensor.
>>
>> Signed-off-by: Per-Daniel Olsson <perdaniel.olsson@...s.com>
> 
> Hi Per-Daniel,
> 
> I think this is nearly there, but still some races or at least places
> I can't currently convince myself aren't races.
> 
> I remember long ago being a wimp and failing to implement a similar dance
> in the max1363 ADC driver. That manages one more complexity in that in
> its continuous mode if events are enabled, the data fields move.
> It still only supports one of events, sysfs read back or buffered output,
> not any combination.  Maybe if I can find the hardware I'll revisit that
> one day.
> 
> Thanks,
> 
> Jonathan

Hi Jonathan,

Thank you for your comments. Also thank you for sharing your story about max1363,
sounds like a tricky piece of silicon. Great to hear that you think it can be 
worth the time to implement concurrent use cases, I was a little worried that I
was pushing that part a little too far... :)

I have added comments below.

Thanks (and happy holidays if I don't hear back from you before),

Per-Daniel

> 
>> diff --git a/drivers/iio/light/opt4060.c b/drivers/iio/light/opt4060.c
>> new file mode 100644
>> index 000000000000..4a7d970c5d7c
>> --- /dev/null
>> +++ b/drivers/iio/light/opt4060.c
> 
> 
>> +struct opt4060_chip {
>> +	struct regmap *regmap;
>> +	struct device *dev;
>> +	struct iio_trigger *trig;
>> +	u8 int_time;
>> +	int irq;
>> +	struct mutex irq_setting_lock;
> 
> General rule is all locks need a comment on what data they protect
> even when they have a nice specific name.  Big advantage is it makes
> it clear what they are not designed to protect!

Done in v10.

> 
>> +	struct completion completion;
>> +	bool thresh_event_lo_active;
>> +	bool thresh_event_hi_active;
>> +};
> 
>> +static void opt4060_claim_irq_setting_lock(struct opt4060_chip *chip)
>> +{
>> +	if (chip->irq)
> I'm struggling to see why you'd be messing with irqs if you don't have
> any?  I can't see a path in which chip->irq isn't set (which makes sense!)
> and you get here.

Good point, fixed in v10.  

> 
> So I think you can just move the mutex inline which avoids the mess
> of lockdep warnings etc and lets you use guard() to simplify things.
> 
>> +		mutex_lock(&chip->irq_setting_lock);
>> +}
>> +
>> +static void opt4060_release_irq_setting_lock(struct opt4060_chip *chip)
>> +{
>> +	if (chip->irq)
>> +		mutex_unlock(&chip->irq_setting_lock);
>> +}
>> +
>> +static int opt4060_set_int_state(struct opt4060_chip *chip, u32 state)
>> +{
>> +	int ret;
>> +	unsigned int regval;
>> +
>> +	opt4060_claim_irq_setting_lock(chip);
>> +	regval = FIELD_PREP(OPT4060_INT_CTRL_INT_CFG, state);
>> +	ret = regmap_update_bits(chip->regmap, OPT4060_INT_CTRL,
>> +				 OPT4060_INT_CTRL_INT_CFG, regval);
>> +	if (ret)
>> +		dev_err(chip->dev, "Failed to set interrupt config\n");
>> +	opt4060_release_irq_setting_lock(chip);
>> +	return ret;
>> +}
>> +
>> +static int opt4060_set_continuous_mode(struct opt4060_chip *chip,
>> +				       bool continuous)
>> +{
>> +	unsigned int reg;
>> +	int ret;
>> +
>> +	ret = regmap_read(chip->regmap, OPT4060_CTRL, &reg);
> 
> You could use a regmap_update_bits() call to simplify this like you 
> do for the int_state above.

That register is a little special. It sometimes requires a real write
even if the data hasn't changed, so regmap_update_bits() won't work. This
is needed for triggering one-shot mode. I have renamed the function in
v10 since it doesn't just set continuous mode but also one-shot mode. I
have also changed the comment in the code.
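
To illustrate the point, here is a minimal userspace sketch (not driver
code). It assumes the usual regmap_update_bits() behaviour of skipping the
bus write when the masked value is unchanged. The fake_* helpers and the
mode bit values are made up for the example and are not the real regmap API
or the real OPT4060 register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device register; not the real regmap API. */
struct fake_reg {
	uint32_t value;		/* current register contents */
	unsigned int writes;	/* number of writes that reached the device */
};

/* Unconditional write: always reaches the device. */
void fake_write(struct fake_reg *r, uint32_t val)
{
	r->value = val;
	r->writes++;
}

/* Read-modify-write that skips the write when nothing would change. */
void fake_update_bits(struct fake_reg *r, uint32_t mask, uint32_t val)
{
	uint32_t next = (r->value & ~mask) | (val & mask);

	if (next != r->value)
		fake_write(r, next);
}

/* Made-up field layout, for illustration only. */
#define FAKE_MODE_MASK		0x30u
#define FAKE_MODE_ONE_SHOT	0x20u

/*
 * Request one-shot mode while the register already says one-shot.
 * Returns how many writes actually reached the (fake) device.
 */
unsigned int demo_one_shot_retrigger(bool use_update_bits)
{
	struct fake_reg ctrl = { .value = FAKE_MODE_ONE_SHOT, .writes = 0 };

	if (use_update_bits)
		fake_update_bits(&ctrl, FAKE_MODE_MASK, FAKE_MODE_ONE_SHOT);
	else
		fake_write(&ctrl, (ctrl.value & ~FAKE_MODE_MASK) | FAKE_MODE_ONE_SHOT);

	assert(ctrl.value == FAKE_MODE_ONE_SHOT);
	return ctrl.writes;
}
```

With the update-bits style helper no write reaches the device, so no new
conversion would be triggered. The plain write always goes out.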

> 
> 
>> +	if (ret < 0) {
>> +		dev_err(chip->dev, "Failed to read ctrl register\n");
>> +		return ret;
>> +	}
>> +	reg &= ~OPT4060_CTRL_OPER_MODE_MASK;
>> +	if (continuous)
>> +		reg |= FIELD_PREP(OPT4060_CTRL_OPER_MODE_MASK,
>> +				  OPT4060_CTRL_OPER_MODE_CONTINUOUS);
>> +	else
>> +		reg |= FIELD_PREP(OPT4060_CTRL_OPER_MODE_MASK,
>> +				  OPT4060_CTRL_OPER_MODE_ONE_SHOT);
>> +
>> +	/* Trigger a new conversions by writing to CRTL register. */
>> +	ret = regmap_write(chip->regmap, OPT4060_CTRL, reg);
>> +	if (ret)
>> +		dev_err(chip->dev, "Failed to set ctrl register\n");
>> +	return ret;
>> +}
>> +
>> +static bool opt4060_event_active(struct opt4060_chip *chip)
>> +{
>> +	return chip->thresh_event_lo_active || chip->thresh_event_hi_active;
>> +}
>> +
>> +static int opt4060_set_state_common(struct opt4060_chip *chip,
>> +				    bool continuous_sampling,
>> +				    bool continuous_irq, bool direct_mode)
>> +{
>> +	int ret = 0;
>> +
>> +	/* It is important to setup irq before sampling to avoid missing samples. */
>> +	if (continuous_irq || !direct_mode)
>> +		ret = opt4060_set_int_state(chip, OPT4060_INT_CTRL_ALL_CH);
>> +	else if (direct_mode)
>> +		ret = opt4060_set_int_state(chip, OPT4060_INT_CTRL_THRESHOLD);
>> +	if (ret) {
>> +		dev_err(chip->dev, "Failed to set irq state.\n");
>> +		return ret;
>> +	}
>> +
>> +	if (continuous_sampling || !direct_mode || opt4060_event_active(chip))
> 
> I think there may also be a race around the event active check.  You could have
> one event direction being enabled concurrently with the other being disabled.
> I'm not sure it matters but worth checking.

I think you might be correct even if I haven't been able to trigger the case in
my tests. I have added a guard for this in v10. If opt4060_write_event_config()
is called from several callers at the same time, the whole sequence all the way
down to setting sampling and irq will be synchronized with the mutex/guard.

> 
> Side effect of either claiming direct or buffered mode is that only one
> caller can do it at a time, so that would close this race as well. 
> Having said that, it's an implementation detail of the core (be it one that
> has been there a long time) so you should really have your own driver
> specific locking scheme prevent that.

I have implemented the "dance" in v10 and it seems to work well in my tests.
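
For readers following along, the "dance" is roughly: try to claim direct
mode, if that fails try to claim buffered mode, and retry if the device
switched mode in between. Below is a simplified single-threaded userspace
model of that loop, not the v10 code. The fake_* names are hypothetical
stand-ins for iio_device_claim_direct_mode() and (as far as I understand
the core) iio_device_claim_buffer_mode().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum fake_mode { FAKE_MODE_DIRECT, FAKE_MODE_BUFFERED };

/* Hypothetical model of the bits of iio_dev state that matter here. */
struct fake_iio_dev {
	enum fake_mode mode;
	bool claimed;	/* stands in for the core holding the mode fixed */
};

/* Succeeds only while the device is in direct mode. */
int fake_claim_direct(struct fake_iio_dev *d)
{
	if (d->mode != FAKE_MODE_DIRECT)
		return -EBUSY;
	d->claimed = true;
	return 0;
}

/* Succeeds only while the device is in buffered mode. */
int fake_claim_buffer(struct fake_iio_dev *d)
{
	if (d->mode != FAKE_MODE_BUFFERED)
		return -EBUSY;
	d->claimed = true;
	return 0;
}

void fake_release(struct fake_iio_dev *d)
{
	d->claimed = false;
}

/*
 * Claim whichever mode the device is currently in and report it through
 * *direct. The mode cannot change until fake_release() is called.
 */
int fake_claim_either(struct fake_iio_dev *d, bool *direct)
{
	for (;;) {
		if (fake_claim_direct(d) == 0) {
			*direct = true;
			return 0;
		}
		if (fake_claim_buffer(d) == 0) {
			*direct = false;
			return 0;
		}
		/* Mode flipped between the two attempts, so try again. */
	}
}
```

The value reported in *direct stays valid for the whole state update,
which is what the v9 code could not guarantee.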

> 
> 
>> +		ret = opt4060_set_continuous_mode(chip, true);
>> +	else if (direct_mode)
>> +		ret = opt4060_set_continuous_mode(chip, false);
>> +	if (ret)
>> +		dev_err(chip->dev, "Failed to set sampling state.\n");
>> +	return ret;
>> +}
>> +
>> +/*
>> + * Function for setting the driver state for sampling and irq. When disabling
>> + * continuous sampling or irq, the IIO direct mode must be claimed to prevent
>> + * races with buffer enabling/disabling. In the case when the direct mode is
>> + * not possible to claim, the function will keep continuous mode. All
>> + * functions, sysfs read, events and buffer, work in continuous mode.
>> + */
>> +static int opt4060_set_driver_state(struct iio_dev *indio_dev,
>> +				    bool continuous_sampling,
>> +				    bool continuous_irq)
>> +{
>> +	struct opt4060_chip *chip = iio_priv(indio_dev);
>> +	bool direct_mode = false;
>> +	int ret = 0;
>> +
>> +	if (!iio_device_claim_direct_mode(indio_dev))
>> +		direct_mode = true;
> 
> Hmm. I'm dubious about this pattern. Why is it fine if the driver
> leaves buffered mode right here? I was expecting this to do
> the dance with claiming either direct mode or buffered mode.
> (with the retry loop).  Direct mode that you pass into the next
> call may well be false when it should be true.
> 
> Even if you can reason why that isn't a problem (and there are worse
> dances where it switches mode multiple times during your call
> of the next function to consider) I think it is easier to reason
> about if we know it is definitely not changing state until we
> release it.

"dance" implemented in v10.

> 
>> +
>> +	ret = opt4060_set_state_common(chip, continuous_sampling,
>> +				       continuous_irq, direct_mode);
>> +
>> +	if (direct_mode)
>> +		iio_device_release_direct_mode(indio_dev);
>> +	return ret;
>> +}
>> +
>> +/*
>> + * This function is called in direct mode from the framework.
>> + */
>> +static int opt4060_trigger_set_state(struct iio_trigger *trig, bool state)
>> +{
>> +	struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
>> +	struct opt4060_chip *chip = iio_priv(indio_dev);
>> +	int ret = 0;
>> +
>> +	return ret = opt4060_set_state_common(chip, state, state, true);
> 
> return opt_set_state_common() is probably the intent.
> 
>> +}
> 
>> +static int opt4060_trigger_new_samples(struct iio_dev *indio_dev)
>> +{
>> +	struct opt4060_chip *chip = iio_priv(indio_dev);
>> +	int ret;
>> +
>> +	/*
>> +	 * The conversion time should be 500us startup time plus the integration time
>> +	 * times the number of channels. An exact timeout isn't critical, it's better
>> +	 * not to get incorrect errors in the log. Setting the timeout to double the
>> +	 * theoretical time plus and extra 100ms margin.
>> +	 */
>> +	unsigned int timeout_us = (500 + OPT4060_NUM_CHANS *
>> +				  opt4060_int_time_reg[chip->int_time][0]) * 2 + 100000;
>> +
>> +	/* Setting the state in one shot mode with irq on each sample. */
>> +	ret = opt4060_set_driver_state(indio_dev, false, true);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (chip->irq) {
>> +		reinit_completion(&chip->completion);
>> +		opt4060_claim_irq_setting_lock(chip);
>> +		if (wait_for_completion_timeout(&chip->completion,
>> +						usecs_to_jiffies(timeout_us)) == 0) {
>> +			dev_err(chip->dev, "Completion timed out.\n");
>> +			opt4060_release_irq_setting_lock(chip);
> 
> This is where exposing the lock directly will simplify things as you can just use
> a guard.

Fixed with guard in v10.
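
As a rough userspace analogue of what guard() buys in the timeout path,
the sketch below uses the GCC/Clang cleanup attribute (which the kernel's
guard() is built on) with a pthread mutex. GUARD_MUTEX, wait_for_sample()
and lock_is_free() are names invented for this example, and the timeout is
simulated with a flag.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Called automatically when the guarded variable goes out of scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical userspace analogue of guard(mutex)(&lock). */
#define GUARD_MUTEX(lock) \
	pthread_mutex_t *guard_ptr \
		__attribute__((cleanup(unlock_cleanup), unused)) = \
		(pthread_mutex_lock(lock), (lock))

static pthread_mutex_t irq_setting_lock = PTHREAD_MUTEX_INITIALIZER;

/* Simulated wait: the early return needs no explicit unlock. */
int wait_for_sample(bool timed_out)
{
	GUARD_MUTEX(&irq_setting_lock);

	if (timed_out)
		return -ETIME;	/* lock is released here automatically */

	return 0;		/* and here */
}

/* Helper for checking that no path leaked the lock. */
bool lock_is_free(void)
{
	if (pthread_mutex_trylock(&irq_setting_lock) != 0)
		return false;
	pthread_mutex_unlock(&irq_setting_lock);
	return true;
}
```

Both the error path and the normal path leave the lock released, without
a matching unlock call on each return.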

> 
>> +			return -ETIME;
>> +		}
>> +		opt4060_release_irq_setting_lock(chip);
>> +	} else {
>> +		unsigned int ready;
>> +
>> +		ret = regmap_read_poll_timeout(chip->regmap, OPT4060_RES_CTRL,
>> +					       ready, (ready & OPT4060_RES_CTRL_CONV_READY),
>> +					       1000, timeout_us);
>> +		if (ret)
>> +			dev_err(chip->dev, "Conversion ready did not finish within timeout.\n");
>> +	}
>> +	/* Setting the state in one shot mode with irq on thresholds. */
>> +	ret = opt4060_set_driver_state(indio_dev, false, false);
>> +
>> +	return ret;
> 
> 	return opt4060_...
> 
>> +}
> 
>> +static int opt4060_write_raw_get_fmt(struct iio_dev *indio_dev,
>> +				     struct iio_chan_spec const *chan,
>> +				     long mask)
>> +{
>> +	switch (mask) {
>> +	case IIO_CHAN_INFO_INT_TIME:
>> +		return IIO_VAL_INT_PLUS_MICRO;
> IIRC That's the default, so you don't need to provide write_raw_get_fmt,
> though no harm in doing so I guess.

Ok, wasn't aware of that being the default.

> 
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +}
> 
> 
>> +static int opt4060_get_channel_sel(struct opt4060_chip *chip, int *ch_sel)
>> +{
>> +	int ret;
>> +	u32 regval;
>> +
>> +	ret = regmap_read(chip->regmap, OPT4060_INT_CTRL, &regval);
>> +	if (ret)
>> +		dev_err(chip->dev, "Failed to get channel selection.\n");
> 
> if you have garbage, not sure it's valid to update ch_sel.
> 
>> +	*ch_sel = FIELD_GET(OPT4060_INT_CTRL_THRESH_SEL, regval);
>> +	return ret;
>> +}
>> +
> 
> 
>> +static int opt4060_write_event_config(struct iio_dev *indio_dev,
>> +				      const struct iio_chan_spec *chan,
>> +				      enum iio_event_type type,
>> +				      enum iio_event_direction dir, bool state)
>> +{
>> +	int ch_sel, ch_idx = chan->scan_index;
>> +	struct opt4060_chip *chip = iio_priv(indio_dev);
>> +	int ret;
>> +
>> +	if (chan->type != IIO_INTENSITY)
>> +		return -EINVAL;
>> +	if (type != IIO_EV_TYPE_THRESH)
>> +		return -EINVAL;
>> +
>> +	ret = opt4060_get_channel_sel(chip, &ch_sel);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (state) {
>> +		/* Only one channel can be active at the same time */
>> +		if ((chip->thresh_event_lo_active ||
>> +			chip->thresh_event_hi_active) && (ch_idx != ch_sel))
> 
> That's a bit nasty to read. I'd use a slightly long line and get the || pair
> on the first line.

Fixed in v10.

> 
> Hmm. We've never made rules on this but some devices do fifo type
> selection if they have limitations on events enabled at the same time.
> With hindsight I think this scheme of just saying no is probably more
> user friendly.

Ok.

> 
>> +			return -EBUSY;
>> +		if (dir == IIO_EV_DIR_FALLING)
>> +			chip->thresh_event_lo_active = true;
>> +		else if (dir == IIO_EV_DIR_RISING)
>> +			chip->thresh_event_hi_active = true;
>> +		ret = opt4060_set_channel_sel(chip, ch_idx);
>> +		if (ret)
>> +			return ret;
>> +	} else {
>> +		if (ch_idx == ch_sel) {
>> +			if (dir == IIO_EV_DIR_FALLING)
>> +				chip->thresh_event_lo_active = false;
>> +			else if (dir == IIO_EV_DIR_RISING)
>> +				chip->thresh_event_hi_active = false;
>> +		}
>> +	}
>> +
>> +	return opt4060_set_driver_state(indio_dev, chip->thresh_event_hi_active |
>> +					chip->thresh_event_lo_active, false);
> Maybe wrap it to have the | pair on lines with nothing else.  They are a little bit buried
> otherwise.

Fixed in v10.

> 	return opt4060_set_driver_state(indio_dev,
> 					chip->thresh_event_hi_active |
> 					chip->thresh_event_lo_active,
> 					false);
> 
>> +}
>> +
>> +static const struct iio_info opt4060_info = {
>> +	.read_raw = opt4060_read_raw,
>> +	.write_raw = opt4060_write_raw,
>> +	.write_raw_get_fmt = opt4060_write_raw_get_fmt,
>> +	.read_avail = opt4060_read_available,
>> +	.read_event_value = opt4060_read_event,
>> +	.write_event_value = opt4060_write_event,
>> +	.read_event_config = opt4060_read_event_config,
>> +	.write_event_config = opt4060_write_event_config,
>> +};
> 
> Given you have option for no irq it is probably worth picking a version of this
> info structure with all the event callbacks removed.   Technically it isn't
> required but it does harden the code (by crashing horribly if you call them ;)
>

Good point, I have added a separate version without those callbacks.
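
The shape of that change is roughly as follows. This is a cut-down
userspace sketch with a hypothetical fake_info struct standing in for
struct iio_info. The real v10 structures and their names may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct iio_info. */
struct fake_info {
	int (*read_raw)(void);
	int (*write_event_config)(void);	/* left NULL without an irq */
};

static int fake_read_raw(void)
{
	return 0;
}

static int fake_write_event_config(void)
{
	return 0;
}

/* Full version, used when an interrupt line is available. */
static const struct fake_info info_with_events = {
	.read_raw = fake_read_raw,
	.write_event_config = fake_write_event_config,
};

/* Version without event callbacks, used when there is no interrupt. */
static const struct fake_info info_no_irq = {
	.read_raw = fake_read_raw,
};

/* Pick the info structure at probe time based on whether an irq exists. */
const struct fake_info *pick_info(int irq)
{
	const struct fake_info *info = irq ? &info_with_events : &info_no_irq;

	assert(info->read_raw != NULL);
	return info;
}
```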
 
> 
> 
>> +static irqreturn_t opt4060_trigger_handler(int irq, void *p)
>> +{
>> +	struct iio_poll_func *pf = p;
>> +	struct iio_dev *idev = pf->indio_dev;
>> +	struct opt4060_chip *chip = iio_priv(idev);
>> +	struct  {
>> +		u32 chan[OPT4060_NUM_CHANS];
>> +		aligned_s64 ts;
>> +	} raw;
>> +	int i = 0;
>> +	int chan, ret;
>> +
>> +	/* If the trigger is coming for a different driver, a new sample is needed.*/
> 
> from a different driver?

I have tried to clarify this comment in v10. When an external trigger such as
iio_sysfs_trigger is used, the sensor will not be running in continuous mode
and a new sample must be triggered.

> 
>> +	if (iio_trigger_validate_own_device(idev->trig, idev))
>> +		opt4060_trigger_new_samples(idev);
>> +
>> +	memset(&raw, 0, sizeof(raw));
>> +
>> +	iio_for_each_active_channel(idev, chan) {
>> +		if (chan == OPT4060_ILLUM)
>> +			ret = opt4060_calc_illuminance(chip, &raw.chan[i++]);
>> +		else
>> +			ret = opt4060_read_raw_value(chip,
>> +						     idev->channels[chan].address,
>> +						     &raw.chan[i++]);
>> +		if (ret) {
>> +			dev_err(chip->dev, "Reading channel data failed\n");
>> +			goto err_read;
>> +		}
>> +	}
>> +
>> +	iio_push_to_buffers_with_timestamp(idev, &raw, pf->timestamp);
>> +err_read:
>> +	iio_trigger_notify_done(idev->trig);
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static irqreturn_t opt4060_irq_thread(int irq, void *private)
>> +{
>> +	struct iio_dev *idev = private;
>> +	struct opt4060_chip *chip = iio_priv(idev);
>> +	int ret, dummy;
>> +	unsigned int int_res;
>> +
>> +	ret = regmap_read(chip->regmap, OPT4060_RES_CTRL, &int_res);
>> +	if (ret < 0) {
>> +		dev_err(chip->dev, "Failed to read interrupt reasons.\n");
>> +		return IRQ_NONE;
>> +	}
>> +
>> +	/* Read OPT4060_CTRL to clear interrupt */
>> +	ret = regmap_read(chip->regmap, OPT4060_CTRL, &dummy);
>> +	if (ret < 0) {
>> +		dev_err(chip->dev, "Failed to clear interrupt\n");
>> +		return IRQ_NONE;
>> +	}
>> +
>> +	/* Handle events */
>> +	if (int_res & (OPT4060_RES_CTRL_FLAG_H | OPT4060_RES_CTRL_FLAG_L)) {
>> +		u64 code;
>> +		int chan = 0;
>> +
>> +		ret = opt4060_get_channel_sel(chip, &chan);
>> +		if (ret) {
>> +			dev_err(chip->dev, "Failed to read threshold channel.\n");
>> +			return IRQ_NONE;
>> +		}
>> +
>> +		/* Check if the interrupt is from the lower threshold */
>> +		if (int_res & OPT4060_RES_CTRL_FLAG_L) {
>> +			code = IIO_MOD_EVENT_CODE(IIO_INTENSITY,
>> +						  chan,
>> +						  idev->channels[chan].channel2,
>> +						  IIO_EV_TYPE_THRESH,
>> +						  IIO_EV_DIR_FALLING);
>> +			iio_push_event(idev, code, iio_get_time_ns(idev));
>> +		}
>> +		/* Check if the interrupt is from the upper threshold */
>> +		if (int_res & OPT4060_RES_CTRL_FLAG_H) {
>> +			code = IIO_MOD_EVENT_CODE(IIO_INTENSITY,
>> +						  chan,
>> +						  idev->channels[chan].channel2,
>> +						  IIO_EV_TYPE_THRESH,
>> +						  IIO_EV_DIR_RISING);
>> +			iio_push_event(idev, code, iio_get_time_ns(idev));
>> +		}
>> +	}
>> +
>> +	/* Handle conversion ready */
>> +	if (int_res & OPT4060_RES_CTRL_CONV_READY) {
>> +		/* Signal completion for potentially waiting reads */
>> +		complete(&chip->completion);
> 
> That looks problematic as you haven't necessarily reset the completion
> if the buffer is enabled.  So you probably need a flag or something similar
> to say a sysfs read has been requested.

The completion is only used when triggering a new sample. The code will call
reinit_completion() which will set the internal counter to zero. The code will
then call wait_for_completion_timeout() which will wait for the counter to
increase. The call to complete() here will increase the counter each time it's
called but since reinit_completion() is called every time before waiting, I don't
think this is an issue.
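
A small model of the counter behaviour described above, in case it helps.
It is a single-threaded userspace sketch. The fake_* functions only mimic
the counting of complete(), reinit_completion() and a wait that either
consumes one completion or times out, and none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the counter inside a struct completion. */
struct fake_completion {
	unsigned int done;
};

/* Models reinit_completion(): forget any earlier complete() calls. */
void fake_reinit(struct fake_completion *c)
{
	c->done = 0;
}

/* Models complete(): one more waiter may proceed. */
void fake_complete(struct fake_completion *c)
{
	c->done++;
}

/*
 * Non-blocking stand-in for wait_for_completion_timeout(): returns true
 * and consumes one completion if available, false models a timeout.
 */
bool fake_try_wait(struct fake_completion *c)
{
	if (!c->done)
		return false;
	c->done--;
	return true;
}

/* Stale completions from buffered mode do not satisfy a later wait. */
bool demo_stale_completions_ignored(void)
{
	struct fake_completion c = { .done = 0 };

	fake_complete(&c);	/* irq thread, buffer enabled, nobody waiting */
	fake_complete(&c);
	fake_reinit(&c);	/* done before each new sample is awaited */
	assert(c.done == 0);
	return !fake_try_wait(&c);
}
```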

> 
> 
>> +
>> +		/* Handle data ready triggers */
>> +		if (iio_buffer_enabled(idev))
>> +			iio_trigger_poll_nested(chip->trig);
>> +	}
>> +	return IRQ_HANDLED;
>> +}
> 
>> +static int opt4060_setup_trigger(struct opt4060_chip *chip, struct iio_dev *idev)
>> +{
>> +	struct iio_trigger *data_trigger;
>> +	char *name;
>> +	int ret;
>> +
>> +	data_trigger = devm_iio_trigger_alloc(chip->dev, "%s-data-ready-dev%d",
>> +					      idev->name, iio_device_id(idev));
>> +	if (!data_trigger)
>> +		return -ENOMEM;
>> +
>> +	/* The data trigger allows for sample capture on each new conversion ready interrupt. */
> 
> Make that a multiline comment.

Done in v10.

> 
>> +	chip->trig = data_trigger;
>> +	data_trigger->ops = &opt4060_trigger_ops;
>> +	iio_trigger_set_drvdata(data_trigger, idev);
>> +	ret = devm_iio_trigger_register(chip->dev, data_trigger);
>> +	if (ret)
>> +		return dev_err_probe(chip->dev, ret,
>> +				     "Data ready trigger registration failed\n");
>> +
>> +	name = devm_kasprintf(chip->dev, GFP_KERNEL, "%s-opt4060",
>> +			      dev_name(chip->dev));
>> +	if (!name)
>> +		return dev_err_probe(chip->dev, -ENOMEM, "Failed to alloc chip name\n");
>> +
>> +	ret = devm_request_threaded_irq(chip->dev, chip->irq, NULL, opt4060_irq_thread,
>> +					IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING |
> That's unusual for a trigger type interrupt and seems likely to give a lot
> of spurious interrupts.  Even if the pulse is short, some interrupt controllers
> will hang on to the bonus edge and trigger again when you reenable the interrupt.
> 
> If intent is to use this for events, then I think you can configure it to latched
> window mode and one edge type and it will all work.

Removed IRQF_TRIGGER_RISING in v10, don't know how both ended up there...

> 
>> +					IRQF_ONESHOT, name, idev);
>> +	if (ret)
>> +		return dev_err_probe(chip->dev, ret, "Could not request IRQ\n");
>> +
>> +	init_completion(&chip->completion);
>> +
>> +	mutex_init(&chip->irq_setting_lock);
> 
> Trivial and I might not even bother changing it, but slight preference for
> 	ret = devm_mutex_init(...)
> 	if (ret)
> 		return ret;
> 
>> +
>> +	ret = regmap_write_bits(chip->regmap, OPT4060_INT_CTRL,
>> +				OPT4060_INT_CTRL_OUTPUT,
>> +				OPT4060_INT_CTRL_OUTPUT);
>> +	if (ret)
>> +		return dev_err_probe(chip->dev, ret,
>> +				     "Failed to set interrupt as output\n");
>> +
>> +	return 0;
>> +}