lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 17 Feb 2020 17:04:36 -0800
From:   Sowjanya Komatineni <skomatineni@...dia.com>
To:     Hans Verkuil <hverkuil@...all.nl>, <thierry.reding@...il.com>,
        <jonathanh@...dia.com>, <frankc@...dia.com>,
        <helen.koike@...labora.com>, <sboyd@...nel.org>
CC:     <linux-media@...r.kernel.org>, <devicetree@...r.kernel.org>,
        <linux-clk@...r.kernel.org>, <linux-tegra@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v3 4/6] media: tegra: Add Tegra210 Video input driver


On 2/17/20 4:59 PM, Sowjanya Komatineni wrote:
>
> On 2/17/20 12:04 AM, Hans Verkuil wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 2/16/20 9:22 PM, Sowjanya Komatineni wrote:
>>> On 2/16/20 12:11 PM, Sowjanya Komatineni wrote:
>>>> On 2/16/20 11:54 AM, Sowjanya Komatineni wrote:
>>>>> On 2/16/20 3:03 AM, Hans Verkuil wrote:
>>>>>> External email: Use caution opening links or attachments
>>>>>>
>>>>>>
>>>>>> On 2/14/20 7:23 PM, Sowjanya Komatineni wrote:
>>>>>>> Tegra210 contains a powerful Video Input (VI) hardware controller
>>>>>>> which can support up to 6 MIPI CSI camera sensors.
>>>>>>>
>>>>>>> Each Tegra CSI port can be one-to-one mapped to VI channel and can
>>>>>>> capture from an external camera sensor connected to CSI or from
>>>>>>> built-in test pattern generator.
>>>>>>>
>>>>>>> Tegra210 supports built-in test pattern generator from CSI to VI.
>>>>>>>
>>>>>>> This patch adds a V4L2 media controller and capture driver support
>>>>>>> for Tegra210 built-in CSI to VI test pattern generator.
>>>>>>>
>>>>>>> Signed-off-by: Sowjanya Komatineni <skomatineni@...dia.com>
>>>>>>> ---
>>>>>>>    drivers/staging/media/Kconfig              |    2 +
>>>>>>>    drivers/staging/media/Makefile             |    1 +
>>>>>>>    drivers/staging/media/tegra/Kconfig        |   10 +
>>>>>>>    drivers/staging/media/tegra/Makefile       |    8 +
>>>>>>>    drivers/staging/media/tegra/TODO           |   10 +
>>>>>>>    drivers/staging/media/tegra/tegra-common.h |  239 +++++++
>>>>>>>    drivers/staging/media/tegra/tegra-csi.c    |  374 ++++++++++
>>>>>>>    drivers/staging/media/tegra/tegra-csi.h    |  115 ++++
>>>>>>>    drivers/staging/media/tegra/tegra-vi.c     | 1019
>>>>>>> ++++++++++++++++++++++++++++
>>>>>>>    drivers/staging/media/tegra/tegra-vi.h     |   79 +++
>>>>>>>    drivers/staging/media/tegra/tegra-video.c  |  118 ++++
>>>>>>>    drivers/staging/media/tegra/tegra-video.h  |   32 +
>>>>>>>    drivers/staging/media/tegra/tegra210.c     |  767
>>>>>>> +++++++++++++++++++++
>>>>>>>    drivers/staging/media/tegra/tegra210.h     |  190 ++++++
>>>>>>>    14 files changed, 2964 insertions(+)
>>>>>>>    create mode 100644 drivers/staging/media/tegra/Kconfig
>>>>>>>    create mode 100644 drivers/staging/media/tegra/Makefile
>>>>>>>    create mode 100644 drivers/staging/media/tegra/TODO
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-common.h
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-csi.c
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-csi.h
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-vi.c
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-vi.h
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-video.c
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-video.h
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra210.c
>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra210.h
>>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>> +/*
>>>>>>> + * videobuf2 queue operations
>>>>>>> + */
>>>>>>> +static int tegra_channel_queue_setup(struct vb2_queue *vq,
>>>>>>> +                                  unsigned int *nbuffers,
>>>>>>> +                                  unsigned int *nplanes,
>>>>>>> +                                  unsigned int sizes[],
>>>>>>> +                                  struct device *alloc_devs[])
>>>>>>> +{
>>>>>>> +     struct tegra_vi_channel *chan = vb2_get_drv_priv(vq);
>>>>>>> +
>>>>>>> +     if (*nplanes)
>>>>>>> +             return sizes[0] < chan->format.sizeimage ? -EINVAL 
>>>>>>> : 0;
>>>>>>> +
>>>>>>> +     *nplanes = 1;
>>>>>>> +     sizes[0] = chan->format.sizeimage;
>>>>>>> +     alloc_devs[0] = chan->vi->dev;
>>>>>>> +
>>>>>>> +     /*
>>>>>>> +      * allocate min 3 buffers in queue to avoid race between DMA
>>>>>>> +      * writes and userspace reads.
>>>>>>> +      */
>>>>>>> +     if (*nbuffers < 3)
>>>>>>> +             *nbuffers = 3;
>>>>>> First of all, don't check this here, instead set the struct
>>>>>> vb2_queue field
>>>>>> 'min_buffers_needed' to 3 instead.
>>>>>>
>>>>>> But the reason given for this check is peculiar: there should not be
>>>>>> any
>>>>>> race at all. Usually the reason for requiring a specific minimum
>>>>>> number of
>>>>>> buffers is that the DMA engine needs at least 2 buffers before it
>>>>>> can start
>>>>>> streaming: it can't give back a buffer to userspace 
>>>>>> (vb2_buffer_done())
>>>>>> unless there is a second buffer it can start to capture to next. So
>>>>>> for many
>>>>>> DMA implementations you need a minimum of 2 buffers: two buffers for
>>>>>> the
>>>>>> DMA engine, one buffer being processed by userspace.
>>>>>>
>>>>>> If the driver is starved of buffers it will typically keep 
>>>>>> capturing to
>>>>>> the last buffer until a new buffer is queued.
>>>>>>
>>>>>> In any case, once the driver releases a buffer via vb2_buffer_done()
>>>>>> the
>>>>>> buffer memory is no longer owned by the driver.
>>>>>>
>>>>>> To be precise, buffer ownership is as follows:
>>>>>>
>>>>>> userspace -> VIDIOC_QBUF -> vb2 -> buf_queue -> driver ->
>>>>>> vb2_buffer_done() -> vb2 -> VIDIOC_DQBUF -> userspace
>>>>>>
>>>>>> (vb2 == videobuf2 framework)
>>>>>>
>>>>>> Note that vb2 never touches the buffer memory.
>>>>>>
>>>>>> So if you get a race condition in this driver, then there is 
>>>>>> something
>>>>>> strange going on. It looks like vb2_buffer_done() is called while
>>>>>> DMA is
>>>>>> still ongoing, or because the driver really needs to keep one buffer
>>>>>> available at all times.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>>           Hans
>>>>> Thanks Hans.
>>>>>
>>>>> On running v4l2-compliance streaming tests for longer run, I noticed
>>>>> kernel reporting unable to write to read-only memory and with debugs
>>>>> I observed when this error was reported, I see 2 buffers queued and
>>>>> both using same address.
>>>>>
>>>>> for first buffer capture start thread initiates capture and wakes
>>>>> done thread to wait for memory write ack and once its done buffer is
>>>>> released to user space but I see upon buffer released to user space
>>>>> immediate next buffer capture single shot gets issued (as soon as
>>>>> single shot is issued frame capture data is written to memory by DMA)
>>>>> and I see this kernel error of unable to write to read-only memory.
>>>>>
>>>>> This error happens rare and happens on long run and all the times of
>>>>> repro's, I see when other thread releases buffer immediate I see
>>>>> single shot gets issued as 2 buffers are queued up at the same time
>>>>> with same DMA address.
>>>>>
>>>> Just to be clear, I meant all the times when kernel reports error
>>>> unable to write to read-only memory, I see 2 buffers gets queued and
>>>> as the capture start thread and done thread are parallel and when
>>>> capture thread wakes done thread on receiving FS event, done thread
>>>> for waiting for memory write happens parallel to next frame capture
>>>> and I see while vb2_buffer_done happens in done thread next frame
>>>> single shot has been issues by capture start thread in parallel when
>>>> it hits this error.
>>> For low latency, we use 2 threads one thread for capture and wait 
>>> for FS
>>> and on receiving FS even wakes other done thread to wait for memory
>>> write to finish.
>>>
>>> While other done thread waits for memory write to finish, capture 
>>> thread
>>> can start capture for next frame and as soon as single shot is issued
>>> capture frame is written to memory and as this thread runs in parallel
>>> to done thread
>>>
>>> there is a possibility vb2_buffer_done being called by
>>> kthread_capture_done while DMA is ongoing by kthread_capture_start 
>>> and I
>>> observed same DMA address being used got both buffers that got 
>>> queued at
>>> same time when it hits this error.
>> "buffers that got queued": you mean that tegra_channel_buffer_queue() is
>> called twice with different buffers (i.e. with different buffer index 
>> values)
>> but with the same DMA address?
>
> yes I see buf_queue invoked twice and both using same DMA address.
>
> I did not monitored buf index when I repro'd with debugs.
>
>> That should not happen (unless the first buffer was returned with
>> vb2_buffer_done() before the second buffer was queued).
>>
>> Can you provide more details? E.g. buffer index, memory model used when
>> streaming, total number of buffers allocated by REQBUFS.
>
> memory type is MMAP. buffer count is 1. Didn't monitored buf index but 
> when this happened as count is 1 so I think it might be index 0.
>
> But what I noticed is on running long hours rarely this repro's and at 
> the time of repro, I see 2 immediate buf_queue with same DMA address.
>
> 1st buffer capture starts and received FS and wakes other thread to 
> wait for MW_ACK_DONE and when capture thread executes for 2nd buffer 
> right as soon as single shot bit is set which writes captured data to 
> memory kernel error of unable to write to read-only memory happens.
>
> I couldn't add more debugs to confirm if 1st buffer release finished 
> or not by the time 2nd buffer single shot is issued but I see 
> MW_ACK_DONE event received for 1st buffer.
>
> Adding more debugs does not repro this kernel error and that could be 
> because delays with debug messages helps time interval b/w buffer 
> release and next buffer single shot.
>
>>
>> I would like to fully understand this. Just increasing the minimum 
>> number
>> of buffers, while reasonable by itself, *does* feel like it is just
>> hiding the symptoms.
>>
>> Regards,
>>
>>          Hans
>
> kthread for capture start and kthread for done which waiting for 
> memory write to happen runs in parallel.
>
> Its hard to confirm if buffer done is invoked by kthread_done at same 
> time of kthread_capture DMA write of next frame as adding debugs has 
> impact on delay and doesnt repro with more debugs.
>
> But just increasing no.of min buffers to 3 doesnt repro at all and I 
> never see 2 buffer queue requests with same DMA address as min buffers 
> are 3.
>
Will add debugs and monitor with index again and check for repro by 
removing min buffers to re-confirm ...
>
>>>>> With using minimum 3 buffers, this issue doesnt happen at all from
>>>>> almost 72 hours of testing.
>>>>>
>>>>>
>>>>> Will try with setting vb2 queue field min_buffers_needed as 3 instead
>>>>> of adding check in queue setup.
>>>>>
>>>>>>> +
>>>>>>> +     return 0;
>>>>>>> +}

Powered by blists - more mailing lists