lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241202001846.GD6105@pendragon.ideasonboard.com>
Date: Mon, 2 Dec 2024 02:18:46 +0200
From: Laurent Pinchart <laurent.pinchart@...asonboard.com>
To: Ricardo Ribalda <ribalda@...omium.org>
Cc: Hans Verkuil <hverkuil-cisco@...all.nl>,
	Hans de Goede <hdegoede@...hat.com>,
	Mauro Carvalho Chehab <mchehab@...nel.org>,
	Guennadi Liakhovetski <guennadi.liakhovetski@...el.com>,
	Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
	linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
	stable@...r.kernel.org
Subject: Re: [PATCH v2 2/4] media: uvcvideo: Do not set an async control
 owned by other fh

On Fri, Nov 29, 2024 at 11:18:54PM +0100, Ricardo Ribalda wrote:
> On Fri, 29 Nov 2024 at 23:03, Laurent Pinchart wrote:
> > On Fri, Nov 29, 2024 at 07:47:31PM +0100, Ricardo Ribalda wrote:
> > > Before we all go on a well deserved weekend, let me recap what we
> > > know. If I did not get something correctly, let me know.
> > >
> > > 1) Well behaved devices do not allow to set or get an incomplete async
> > > control. They will stall instead (ref: Figure 2-21 in UVC 1.5 )
> > > 2) Both Laurent and Ricardo consider that there is a big chance that
> > > some camera modules do not implement this properly. (ref: years of
> > > crying over broken module firmware :) )
> > >
> > > 3) ctrl->handle is designed to point to the fh that originated the
> > > control. So the logic can decide if the originator needs to be
> > > notified or not. (ref: uvc_ctrl_send_event() )
> > > 4) Right now we replace the originator in ctrl->handle for unfinished
> > > async controls.  (ref:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/usb/uvc/uvc_ctrl.c#n2050)
> > >
> > > My interpretation is that:
> > > A) We need to change 4). We shall not change the originator of
> > > unfinished ctrl->handle.
> > > B) Well behaved cameras do not need the patch "Do not set an async
> > > control owned by another fh"
> > > C) For badly behaved cameras, it is fine if we slightly break the
> > > v4l2-compliance in corner cases, if we do not break any internal data
> > > structure.
> >
> > The fact that some devices may not implement the documented behaviour
> > correctly may not be a problem. Well-behaved devices will stall, which
> > means we shouldn't query the device while as async update is in
> > progress. Badly-behaved devices, whatever they do when queried, should
> > not cause any issue if we don't query them.
> 
> I thought we could detect the stall and return safely. Isn't that the case?

We could, but if we know the device will stall anyway, is there a reason
not to avoid issuing the request in the first place ?

> Why we have not seen issues with this?

I haven't tested a PTZ device for a very long time, and you would need
to hit a small time window to see the issue.

> > We should not send GET_CUR and SET_CUR requests to the device while an
> > async update is in progress, and use cached values instead. When we
> > receive the async update event, we should clear the cache. This will be
> > the same for both well-behaved and badly-behaved devices, so we can
> > expose the same behaviour towards userspace.
> 
> seting ctrl->loaded = 0 when we get an event sounds like a good idea
> and something we can implement right away.
> If I have to resend the set I will add it to the end.
> 
> > We possibly also need some kind of timeout mechanism to cope with the
> > async update event not being delivered by the device.
> 
> This is the part that worries me the most:
> - timeouts make the code fragile
> - What is a good value for timeout? 1 second, 30, 300? I do not think
> that we can find a value.

I've been thinking about the implementation of uvc_fh cleanup over the
weekend, and having a timeout would have the nice advantage that we
could reference-count uvc_fh instead of implementing a cleanup that
walks over all controls when closing a file handle. I think it would
make the code simpler, and possibly safer too.

> > Regarding the userspace behaviour during an auto-update, we have
> > multiple options:
> >
> > For control get,
> >
> > - We can return -EBUSY
> > - We can return the old value from the cache
> > - We can return the new value fromt he cache
> >
> > Returning -EBUSY would be simpler to implement.
>
> Not only easy, I think it is the most correct,
> 
> > I don't think the behaviour should depend on whether the control is read
> > on the file handle that initiated the async operation or on a different
> > file handle.
> >
> > For control set, I don't think we can do much else than returning
> > -EBUSY, regardless of which file handle the control is set on.
> 
> ACK.
> 
> > > I will send a new version with my interpretation.
> > >
> > > Thanks for a great discussion
> 
> Looking with some perspective... I believe that we should look into
> the "userspace behaviour for auto controls" in a different patchset.
> It is slightly unrelated to this discussion.

-- 
Regards,

Laurent Pinchart

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ