lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPBb6MV+MTuYPFUxfgK6hkrXKXYUz-Jc6iPES9E3jQyQJ9gNCg@mail.gmail.com>
Date:   Mon, 29 May 2017 16:29:59 +0900
From:   Alexandre Courbot <acourbot@...omium.org>
To:     Jacek Anaszewski <jacek.anaszewski@...il.com>
Cc:     Andrzej Pietrasiewicz <andrzej.p@...sung.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        linux-media@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] [media] s5p-jpeg: fix recursive spinlock acquisition

Hi everyone,

On Thu, Apr 27, 2017 at 4:35 AM, Jacek Anaszewski
<jacek.anaszewski@...il.com> wrote:
> On 04/26/2017 04:54 AM, Alexandre Courbot wrote:
>> On Wed, Apr 26, 2017 at 4:15 AM, Jacek Anaszewski
>> <jacek.anaszewski@...il.com> wrote:
>>> Hi Alexandre,
>>>
>>> Thanks for the patch.
>>>
>>> On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
>>>> v4l2_m2m_job_finish(), which is called from the interrupt handler with
>>>> slock acquired, can call the device_run() hook immediately if another
>>>> context was in the queue. This hook also acquires slock, resulting in
>>>> a deadlock for this scenario.
>>>>
>>>> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
>>>> This is safe to do as the state of the hardware cannot change before
>>>> v4l2_m2m_job_finish() is called anyway.
>>>>
>>>> Signed-off-by: Alexandre Courbot <acourbot@...omium.org>
>>>> ---
>>>>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>>>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>> index 52dc7941db65..223b4379929e 100644
>>>> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>>>> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void *dev_id)
>>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>>       v4l2_m2m_buf_done(dst_buf, state);
>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>
>>>>       curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>>>>       spin_unlock(&jpeg->slock);
>>>>
>>>>       s5p_jpeg_clear_int(jpeg->regs);
>>>>
>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>       return IRQ_HANDLED;
>>>>  }
>>>>
>>>> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void *priv)
>>>>               v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>>>>       }
>>>>
>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>       if (jpeg->variant->version == SJPEG_EXYNOS4)
>>>>               curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>>>>
>>>>       spin_unlock(&jpeg->slock);
>>>> +
>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>       return IRQ_HANDLED;
>>>>  }
>>>>
>>>> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void *dev_id)
>>>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>>>       v4l2_m2m_buf_done(dst_buf, state);
>>>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>>
>>>>       curr_ctx->subsampling =
>>>>                       exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
>>>> +
>>>> +     spin_unlock(&jpeg->slock);
>>>> +
>>>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>>> +     return IRQ_HANDLED;
>>>> +
>>>>  exit_unlock:
>>>>       spin_unlock(&jpeg->slock);
>>>>       return IRQ_HANDLED;
>>>>
>>>
>>> Acked-by: Jacek Anaszewski <jacek.anaszewski@...il.com>
>>>
>>> Just out of curiosity - could you share how you discovered the problem -
>>> by some static checkers or trying to use the driver?
>>
>> We discovered this issue after adding a new unit test for the jpeg
>> codec in Chromium OS:
>>
>> https://bugs.chromium.org/p/chromium/issues/detail?id=705971
>>
>>>>From what I understand the test spawns different processes that access
>> the codec device concurrently, creating the situation leading to the
>> bug.
>
> Thanks for the explanation. Nice fix.

Gentle ping as I am not seeing this patch in the tree yet. Thanks.

>
>> On a slightly related note, I was thinking whether it would make sense
>> to move the call to  v4l2_m2m_job_finish() (and maybe other parts of
>> the current interrupt handler) into a worker or a threaded interrupt
>> handler so as to reduce the time we spend with interrupts disabled.
>> Can I have your input on this idea?
>
> Right, all remaining drivers call it from workers.
> Feel free to submit a patch.
>
> --
> Best regards,
> Jacek Anaszewski

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ