Message-ID: <e32a07c7-18ff-49a4-98bf-dda10ab69f16@linux.dev>
Date: Thu, 19 Dec 2024 18:39:26 +0800
From: Sui Jingfeng <sui.jingfeng@...ux.dev>
To: Icenowy Zheng <uwu@...nowy.me>, Xi Ruoyao <xry111@...111.site>,
WANG Xuerui <kernel@...0n.name>, Huacai Chen <chenhuacai@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"Mike Rapoport (IBM)" <rppt@...nel.org>, Baoquan He <bhe@...hat.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
David Hildenbrand <david@...hat.com>, Zhen Lei <thunder.leizhen@...wei.com>,
Thomas Gleixner <tglx@...utronix.de>, Zhihong Dong <donmor3000@...mail.com>,
loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] loongarch/mm: disable WUC for pgprot_writecombine as
same as ioremap_wc
On 2024/12/19 14:38, Icenowy Zheng wrote:
> On Thu, 2024-12-19 at 13:49 +0800, Sui Jingfeng wrote:
>> On 2024/12/19 12:49, Icenowy Zheng wrote:
>>> On Thu, 2024-12-19 at 10:54 +0800, Sui Jingfeng wrote:
>>>> On 2024/12/18 20:43, Icenowy Zheng wrote:
>>>>> As for drm/ast's dramatic drop, it's because writes to the
>>>>> framebuffer can no longer be reordered.
>>>> No, your understanding is wrong, very, very wrong.
>>>>
>>>> It's not because it can't reorder the writes. Rather, it's because
>>>> the CPU can no longer do write gathering or burst writes.
>>> Write gathering is a kind of write reordering,
>>
>> No, your understanding is broken.
>>
>> Write gathering *isn't* a kind of write reordering.
> It is, it changes the order "write A -> write B -> write C -> write D"
> to "write ABCD concurrently".
The reordering mentioned here isn't the main factor that
affects performance. The cache-like behavior and the better
bandwidth utilization (burst writes) are.
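To make the bandwidth point concrete, here is a toy userspace model, not
kernel code and not a description of how LoongArch hardware is actually
implemented. It only counts bus transactions for a run of contiguous byte
stores. The 64-byte burst size and the names `bus_stats`, `store_uncached`
and `store_write_combined` are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WC_LINE 64u /* assumed burst size in bytes (hypothetical) */

struct bus_stats {
	unsigned transactions; /* bus transactions issued */
	unsigned bytes;        /* payload bytes carried */
};

/* Strongly ordered uncached path: each byte store is its own transaction. */
static void store_uncached(struct bus_stats *s, size_t len)
{
	assert(s != NULL);
	for (size_t i = 0; i < len; i++) {
		s->transactions++;
		s->bytes++;
	}
}

/*
 * Write-gathering path: contiguous stores accumulate in a line-sized
 * buffer and leave as one burst when the next store falls into a new
 * line. Bursts are still emitted in ascending address order in this model.
 */
static void store_write_combined(struct bus_stats *s, uint64_t addr, size_t len)
{
	unsigned pending = 0;           /* bytes gathered in the current line */
	uint64_t line = addr / WC_LINE; /* line the buffer currently holds */

	assert(s != NULL);
	for (size_t i = 0; i < len; i++) {
		uint64_t cur = (addr + i) / WC_LINE;

		if (cur != line) {
			/* crossed into a new line: flush the gathered bytes */
			s->transactions++;
			s->bytes += pending;
			pending = 0;
			line = cur;
		}
		pending++;
	}
	if (pending) {
		/* flush whatever is left in the buffer */
		s->transactions++;
		s->bytes += pending;
	}
}
```

In this model, 256 contiguous bytes cost 256 transactions on the uncached
path and 4 bursts on the gathering path, while the data still reaches the
bus in address order.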
> If one of B/C/D is a register that triggers latching A
MIPS/LoongArch CPUs don't allow an *uncached read* to bypass earlier writes.
This means that when you issue an *uncached read*, the preceding
write operations must already have been completed in hardware memory.
But this is true only for *uncached reads* issued by the *CPU*.
How would "write B", "write C" and "write D" trigger the
latching of A here?
> in the former case it will latch A correctly but
> in the latter case it will wrongly latch the old value of A instead, so
> write gathering is not strongly-ordered.
>
>
For accesses from the CPU side, registers are mapped *uncached*.
Register accesses by the CPU are all strongly ordered.
All DRM drivers map their registers in a strongly ordered, uncached fashion.
Have you ever seen any exceptions?
Even with the command submission approach, registers are not written to
the hardware until the kickoff command is issued to the hardware.
The write order depends on the order of occurrence in the ring buffer,
not the issue order. Commands that come first in the ring buffer
get executed first. But there is still no hint that
"write B", "write C" and "write D" will lead to "latch A".
So please stop misleading us by making up a cock-and-bull story.
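A rough userspace sketch of the ring buffer point, under made-up
assumptions: `fake_gpu`, `ring_put` and `ring_kickoff` are hypothetical
names, and no real driver or command format is modeled. Commands sit in
ring slots, nothing touches the register file until kickoff, and execution
follows slot position rather than the order in which the CPU filled the
slots.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 8u  /* hypothetical ring size */
#define NUM_REGS   16u /* hypothetical register file size */

struct reg_cmd {
	uint32_t reg; /* register index to write */
	uint32_t val; /* value to write */
};

struct fake_gpu {
	struct reg_cmd ring[RING_SLOTS];
	uint32_t regs[NUM_REGS];
	uint32_t exec_order[RING_SLOTS]; /* register indices, in execution order */
	unsigned executed;
};

/* The CPU fills a slot; the "hardware" registers are not touched here. */
static void ring_put(struct fake_gpu *g, unsigned slot, uint32_t reg, uint32_t val)
{
	assert(slot < RING_SLOTS && reg < NUM_REGS);
	g->ring[slot].reg = reg;
	g->ring[slot].val = val;
}

/* Kickoff: the "hardware" consumes the first `count` slots in ring order. */
static void ring_kickoff(struct fake_gpu *g, unsigned count)
{
	assert(count <= RING_SLOTS);
	g->executed = 0;
	for (unsigned i = 0; i < count; i++) {
		struct reg_cmd c = g->ring[i];

		g->regs[c.reg] = c.val;
		g->exec_order[g->executed++] = c.reg;
	}
}
```

If slot 1 is filled before slot 0, both registers stay untouched until
`ring_kickoff()` runs, and the command in slot 0 is still executed first.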
>> It doesn't have to reorder; it just caches the write operations in
>> the CPU's write buffer.
>>
>>
>>> compared to strongly
>>> ordered writing (which is literally one byte per write).
>>>
>>>> So do you still think your patch is harmless?
>>> Well, I said that performance w/o correctness is meaningless.
>>
>> The point is that Write-Combine on drm/ast will get both correctness
>> and performance.
>>
>>
--
Best regards,
Sui