linux-kernel - Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAO9ioeVuAO6mYpBSpiTW0jhFRPtkubZ5eEskd1yLBHVdR8_YMA@mail.gmail.com>
Date: Tue, 17 Jun 2025 22:32:44 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
To: Jyothi Kumar Seerapu <quic_jseerapu@...cinc.com>
Cc: Vinod Koul <vkoul@...nel.org>,
        Mukesh Kumar Savaliya <quic_msavaliy@...cinc.com>,
        Viken Dadhaniya <quic_vdadhani@...cinc.com>,
        Andi Shyti <andi.shyti@...nel.org>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Christian König <christian.koenig@....com>,
        linux-arm-msm@...r.kernel.org, dmaengine@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-i2c@...r.kernel.org,
        linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        linaro-mm-sig@...ts.linaro.org, quic_vtanuku@...cinc.com
Subject: Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support

On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
<quic_jseerapu@...cinc.com> wrote:
>
>
>
> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
> > On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
> >>
> >>
> >> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> >>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> >>>>
> >>>>
> >>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> >>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> >>>>>> Hi Dimitry, Thanks for providing the review comments.
> >>>>>>
> >>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> >>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
> >>>>>>>> The I2C driver gets an interrupt upon transfer completion.
> >>>>>>>> When handling multiple messages in a single transfer, this
> >>>>>>>> results in N interrupts for N messages, leading to significant
> >>>>>>>> software interrupt latency.
> >>>>>>>>
> >>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
> >>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
> >>>>>>>> generation and BEI is disabled when an interrupt is necessary.
> >>>>>>>>
> >>>>>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
> >>>>>>>> Interrupts are not expected for the first 7 message completions, only
> >>>>>>>> the last message triggers an interrupt, indicating the completion of
> >>>>>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
> >>>>>>>
> >>>>>>> Why do you need this complexity? Is it possible to set the
> >>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
> >>>>>>
> >>>>>> If i undertsand correctly, the suggestion is to get the single
> >>>>>> intetrrupt for last i2c message only.
> >>>>>>
> >>>>>> But With this approach, we can't handle large number of i2c messages
> >>>>>> in the transfer.
> >>>>>>
> >>>>>> In GPI driver, number of max TREs support is harcoded to 64 (#define
> >>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
> >>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle all the
> >>>>>> N messages.
> >>>>>
> >>>>> It sounds like a DMA driver issue. In other words, the DMA driver can
> >>>>> know that it must issue an interrupt before exausting 64 TREs in order
> >>>>> to
> >>>>>
> >>>>>>
> >>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> >>>>>> or 'num' incase for less messsages), process and unmap/free upon the
> >>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> >>>>>
> >>>>> Why? This is some random value which has no connection with CHAN_TREs.
> >>>>> Also, what if one of the platforms get a 'liter' GPI which supports less
> >>>>> TREs in a single run? Or a super-premium platform which can use 256
> >>>>> TREs? Please don't workaround issues from one driver in another one.
> >>>>
> >>>> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
> >>>> With the following approach, the GPI hardware can process N number of I2C
> >>>> messages, thereby improving throughput and transfer efficiency.
> >>>>
> >>>> The main design consideration for using the block event interrupt is as
> >>>> follows:
> >>>>
> >>>> Allow the hardware to process the TREs (I2C messages), while the software
> >>>> concurrently prepares the next set of TREs to be submitted to the hardware.
> >>>> Once the TREs are processed, they can be freed, enabling the software to
> >>>> queue new TREs. This approach enhances overall optimization.
> >>>>
> >>>> Please let me know if you have any questions, concerns, or suggestions.
> >>>
> >>> The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> >>> What is the reason for that limit, etc. If you think about it, The GENI
> >>> / I2C doesn't impose any limit on the number of messages processed in
> >>> one go (if I understand it correctly). Instead the limit comes from the
> >>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> >>> driver. Make GPI DMA driver responsible for saying 'no more for now',
> >>> then I2C driver can setup add an interrupt flag and proceed with
> >>> submitting next messages, etc.
> >>>
> >>
> >> For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
> >> if a large number of I2C messages are submitted then may may run out of
> >> memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
> >> which is insufficient to serve 32 or even 16 I2C messages concurrently,
> >> given the multiple TREs required per message.
> >>
> >> To address this limitation, a strategy has been implemented to manage how
> >> many messages can be queued and how memory is recycled. The constant
> >> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
> >> messages that can be queued at once. Additionally,
> >> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
> >> half of the queued messages are expected to be freed or deallocated per
> >> interrupt.
> >> This approach ensures that the driver can efficiently manage TRE resources
> >> and continue queuing new I2C messages without exhausting memory.
> >>> I really don't see a reason for additional complicated handling in the
> >>> geni driver that you've implemented. Maybe I misunderstand something. In
> >>> such a case it usually means that you have to explain the design in the
> >>> commit message / in-code comments.
> >>>
> >>
> >>
> >> The I2C Geni driver is designed to prepare and submit descriptors to the GPI
> >> driver one message at a time.
> >> As a result, the GPI driver does not have visibility into the current
> >> message index or the total number of I2C messages in a transfer. This lack
> >> of context makes it challenging to determine when to set the block event
> >> interrupt, which is typically used to signal the completion of a batch of
> >> messages.
> >>
> >> So, the responsibility for deciding when to set the BEI should lie with the
> >> I2C driver.
> >>
> >> If this approach is acceptable, I will proceed with updating the relevant
> >> details in the commit message.
> >>
> >> Please let me know if you have any concerns or suggestions.
> >
> Hi Dmitry, Sorry for the delayed response, and thank you for the
> suggestions.
>
> > - Make gpi_prep_slave_sg() return NULL if flags don't have
> >    DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
> >    interrupt-enabled transfer.
> "there are no 3 empty TREs for the interrupt-enabled transfer."
> Could you please help me understand this a bit better?

In the GPI driver you know how many TREs are available. In
gpi_prep_slave_sg() you can check that and return an error if there
are not enough TREs available.

> >
> > - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
> >    again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
> >    gets DMA_PREP_INTERRUPT.
> Does this mean we need to proceed to the next I2C message and ensure
> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
> chunk? And then, should we submit the chunk of messages to the GSI
> hardware for processing?

No. You don't have to peek at the next I2C message. This all concerns
the current I2C message. The only point where you have to worry is to
explicitly set the flag for the last message.

>
> >
> > - In geni_i2c_gpi_xfer() split the loop to submit messages until you
> >    can, then call wait_for_completion_timeout() and then
> >    geni_i2c_gpi_unmap() for submitted messages, then continue with a new
> >    portion of messages.
> Since the GPI channel supports a maximum of 64 TREs, should we consider
> submitting a smaller number of predefined messages — perhaps fewer than
> 32, such as 16?

Why? Just submit messages until they fit, then flush the DMA async channel.

> This is because handling 32 messages would require one TRE for config
> and 64 TREs for the Go and DMA preparation steps, which exceeds the
> channel's TRE capacity of 64.
>
> We designed the approach to submit a portion of the messages — for
> example, 16 at a time. Once 8 messages are processed and freed, the
> hardware can continue processing the TREs, while the software
> simultaneously prepares the next set of TREs. This parallelism helps in
> efficiently utilizing the hardware and enhances overall system
> optimization.


And this overcomplicates the driver and introduces artificial
limitations which need explanation. Please fix it in a simple way
first. Then you can e.g. implement the watermark at the half of the
GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
middle of the full sequence, allowing it to be used asynchronously in
the background.

-- 
With best wishes
Dmitry