lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19109672-2856-457f-b1f6-305abc6c4434@linaro.org>
Date: Sun, 2 Mar 2025 15:56:04 +0000
From: Bryan O'Donoghue <bryan.odonoghue@...aro.org>
To: Vedang Nagar <quic_vnagar@...cinc.com>,
 Stanimir Varbanov <stanimir.k.varbanov@...il.com>,
 Vikash Garodia <quic_vgarodia@...cinc.com>,
 Mauro Carvalho Chehab <mchehab@...nel.org>
Cc: linux-media@...r.kernel.org, linux-arm-msm@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/2] venus driver fixes to avoid possible OOB read
 access

On 02/03/2025 11:58, Vedang Nagar wrote:
>>
>> The basic question : what is the lifetime of the data from RX 
>> interrupt to consumption by another system agent, DSP, userspace, 
>> whatever ?
> As mentioned in [1], With the regular firmware, after RX interrupt the 
> data can be considered as valid until next interrupt is raised, but with 
> the rouge firmware, data can get invalid during the second read and our 
> intention is to avoid out of bound access read because of such issues.

This is definitely the part I don't compute.

1. RX interrupt
2. Frame#0 Some amount of time data is always valid
3. RX interrupt - new data
4. Frame#1 new data delivered into a buffer

Are you describing a case between RX interrupts 1-3 or a case after 1-4?

Why do we need to write code for rouge firmware anyway ?

And the real question - if the data can be invalidated in the 1-3 window 
above when is the safe time to snapshot that data ?

We seem to have alot of submissions to deal with 'rouge' firmware 
without I think properly describing the problem of the _expected_ data 
lifetime.

So

a) What is the expected data lifetime of an RX buffer between one
    RX IRQ and the next ?
    I hope the answer to this is - APSS owns the buffer.
    This is BTW usually the case in these types of asymmetric setups
    with a flag or some other kind of semaphore that indicates which
    side of the data-exchange owns the buffer.

b) In this rouge - buggy - firmware case what is the scope of the
    potential race condition ?

    What I'd really like to know here is why we have to seemingly
    memcpy() again and again in seemingly incongrous and not
    immediately obvious places in the code.

    Would we not be better advised to do a memcpy() of the entire
    RX frame in the RX IRQ handler path if as you appear to me
    suggesting - the firmware can "race" with the APSS
    i.e. the data-buffer ownership flag either doesn't work
    or isn't respected by one side in the data-exchange.

Can we please have a detailed description of the race condition here ?

I don't doubt the new memcpy() makes sense to you but without this 
detailed understanding of the underlying problem its virtually 
impossible to debate the appropriate remediation - perhaps this patch 
you've submitted - or some other solution.

Sorry to dig into my trench here but, way more detail is needed.

> [1]: https://lore.kernel.org/lkml/4cfc1fe1-2fab-4256-9ce2- 
> b4a0aad1069e@...aro.org/T/#m5f1737b16e68f8b8fc1d75517356b6566d0ec619
>>
>> Why is it in this small specific window that the data can change but 
>> not later ? What is the mechanism the data can change and how do the 
>> changes you propose here address the data lifetime problem ?
> Currently this issue has been discovered by external researchers at this 
> point, but if any such OOB issue is discovered at later point as well 
> then we shall fix them as well.

Right but, I'm looking for a detailed description of the problem.

Can you describe from RX interrupt again what the expected data lifetime 
of the RX frame is, which I hope we agree is until the next RX interrupt 
associated with a given buffer with an ownership flag shared between 
firmware and APSS - and then under what circumstances that "software 
contract" is being violated.

> Also, with rougue firmware we cannot fix the data lifetime problem in my 
> opinion, but atleast we can fix the out of bound issues.
>>
>> Without that context, I don't believe it is really possible to 
>> validate an additional memcpy() here and there in the code as fixing 
>> anything.
> There is no additional memcpy() now in the v2 patch, but as part of the 
> fix, we are just trying to retain the length of the packet which was 
> being read in the first memcpy() to avoid the OOB read access.

I can't make a suggestion because - personally speaking I still don't 
quite understand the data-race you are describing.

I get that you say the firmware is breaking the contract but, without 
more detail on _how_ it breaks that contract I don't think it's really 
possible to validate your fix here, fixes anything.

---
bod

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ