[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e8b49c34-90a1-4610-b7cd-8eff1b1a312a@kernel.org>
Date: Mon, 4 Mar 2024 07:19:11 +0100
From: Jiri Slaby <jirislaby@...nel.org>
To: Rengarajan.S@...rochip.com, linux-serial@...r.kernel.org,
gregkh@...uxfoundation.org, Kumaravel.Thiagarajan@...rochip.com,
UNGLinuxDriver@...rochip.com, Tharunkumar.Pasumarthi@...rochip.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 tty] 8250: microchip: pci1xxxx: Refactor TX Burst code
to use pre-existing APIs
On 04. 03. 24, 5:37, Rengarajan.S@...rochip.com wrote:
> Hi Jiri,
>
> On Fri, 2024-02-23 at 10:26 +0100, Jiri Slaby wrote:
>> EXTERNAL EMAIL: Do not click links or open attachments unless you
>> know the content is safe
>>
>> On 23. 02. 24, 10:21, Rengarajan.S@...rochip.com wrote:
>>> On Fri, 2024-02-23 at 07:08 +0100, Jiri Slaby wrote:
>>>> EXTERNAL EMAIL: Do not click links or open attachments unless you
>>>> know the content is safe
>>>>
>>>> On 22. 02. 24, 14:49, Rengarajan S wrote:
>>>>> Updated the TX Burst implementation by changing the circular
>>>>> buffer
>>>>> processing with the pre-existing APIs in kernel. Also updated
>>>>> conditional
>>>>> statements and alignment issues for better readability.
>>>>
>>>> Hi,
>>>>
>>>> so why are you keeping the nested double loop?
>>>>
>>>
>>> Hi, in order to differentiate Burst mode handling with byte mode
>>> had
>>> seperate loops for both. Since, having single while loop also does
>>> not
>>> align with rx implementation (where we have seperate handling for
>>> burst
>>> and byte) have retained the double loop.
>>
>> So obviously, align RX to a single loop if possible. The current TX
>> code
>> is very hard to follow and sort of unmaintainable (and buggy). And
>> IMO
>> it's unnecessary as I proposed [1]. And even if RX cannot be one
>> loop,
>> you still can make TX easy to read as the two need not be the same.
>>
>> [1]
>> https://lore.kernel.org/all/b8325c3f-bf5b-4c55-8dce-ef395edce251@kernel.org/
>
>
> while (data_empty_count) {
> cnt = CIRC_CNT_TO_END();
> if (!cnt)
> break;
> if (cnt < UART_BURST_SIZE || (tail & 3)) { // is_unaligned()
> writeb();
> cnt = 1;
> } else {
> writel()
> cnt = UART_BURST_SIZE;
> }
> uart_xmit_advance(cnt);
> data_empty_count -= cnt;
> }
>
> With the above implementation we are observing performance drop of 2
> Mbps at baud rate of 4 Mbps. The reason for this is the fact that for
> each iteration we are checking if the the data need to be processed via
> DWORDs or Bytes. The condition check for each iteration is causing the
> drop in performance.
Hi,
the check is by several orders of magnitude faster than the I/O proper.
So I don't think that's the root cause.
> With the previous implementation(with nested loops) the performance is
> found to be around 4 Mbps at baud rate of 4 Mbps. In that
> implementation we handle sending DWORDs continuosly until the transfer
> size < 4. Can you let us know any other alternatives for the above
> performance drop.
Could you attach the patch you are testing?
thanks,
--
js
suse labs
Powered by blists - more mailing lists