linux-kernel - Re: [PATCH v4 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87jz52nrzo.ffs@tglx>
Date: Mon, 23 Jun 2025 22:05:31 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Tianyang Zhang <zhangtianyang@...ngson.cn>, chenhuacai@...nel.org,
 kernel@...0n.name, corbet@....net, alexs@...nel.org, si.yanteng@...ux.dev,
 jiaxun.yang@...goat.com, peterz@...radead.org, wangliupu@...ngson.cn,
 lvjianmin@...ngson.cn, maobibo@...ngson.cn, siyanteng@...oftware.com.cn,
 gaosong@...ngson.cn, yangtiezhu@...ngson.cn
Cc: loongarch@...ts.linux.dev, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support

On Mon, Jun 23 2025 at 17:33, Tianyang Zhang wrote:
> 在 2025/6/23 下午4:27, Thomas Gleixner 写道:
>> tail is the software managed part of the ringbuffer which is shared with
>> the hardware, right?
>>
>> So even if the compiler would be allowed to reevalutate tail after the
>> barrier (it is NOT), then tail would still contain the same value as
>> before, no?
>>
>> The wmb() is required to ensure that the hardware can observe the full
>> write of the command _before_ it can observe the update to the tail
>> index.
>>
>> Anything else is voodoo.
>>
>> Thanks,
>>
>>          tglx
>
> In my previous understanding, tail'value is actually a part of 'full 
> write of the command '.

Of course. The hardware observes the tail value. If it is not updated
then the command is not executed. But these are two distinct things:

The invalidate command is written to a location in the command buffer
which is determined by tail:

	inv_addr = (struct irde_inv_cmd *)(rqueue->base + tail * sizeof(struct irde_inv_cmd));
	memcpy(inv_addr, cmd, sizeof(struct irde_inv_cmd));

requeue::base points to an array of invalidate commands. The array
size is INVALID_QUEUE_SIZE. tail is the position in the array to which
the software writes the next command. tail is software managed and
written to a completely different location via write_queue_tail(...):

static phys_addr_t redirect_reg_base = 0x1fe00000;

#define REDIRECT_REG_BASE(reg, node) \
	(UNCACHE_BASE | redirect_reg_base | (u64)(node) << NODE_ADDRSPACE_SHIFT | (reg))
#define	redirect_reg_queue_tail(node)	REDIRECT_REG_BASE(LOONGARCH_IOCSR_REDIRECT_CQT, (node))
#define read_queue_tail(node)		(*((u32 *)(redirect_reg_queue_tail(node))))
#define write_queue_tail(node, val)	(*((u32 *)(redirect_reg_queue_tail(node))) = (val))

The hardware maintains the head index. It's the last command index the
hardware processed. When the hardware observes that head != tail then it
processes the next entry and after completion it updates head with the
next index. This repeats until head == tail.

> We must ensure that tail is updated to the correct value first, and
> then write this value into the register (perhaps by adding wmb in
> write_queue_tail ).

No. The local variable 'tail' is purely local to the CPU and the
invalidation hardware does not even know that it exists.

There are two things which are relevant to the hardware:

   1) command is written to the hardware visible array at index of tail

   2) hardware visible tail memory (register) is updated to tail + 1

The memory barrier is required to prevent that #2 is written to the
hardware _before_ #1 completed and is fully visible to the hardware.

> In other words, this is originally to prevent the write register 
> instruction from being executed out of order before updating tail.

No. The barrier is solely for the above #1 vs. #2 ordering.

There is a difference between program flow ordering and memory ordering.

The hardware _CANNOT_ execute the write _before_ the value it writes is
computed. That's enforced by program flow order.

So it's always guaranteed that the CPU executes

   tail + 1

_before_ executing the write to the register because that's a program
order dependency.

If that would not be guaranteed, then your CPU would have more serious
problems than this piece of code. It simply would not even survive the
first five instructions in the boot loader.

But what's not guaranteed is that consecutive writes become visible to
other parts of the system (other CPUs, the invalidation hardware, ....)
in the same consecutive order.

To ensure that ordering you need a WMB(). If you would have CPU to CPU
communication then you would need a RMB() on the reader side to ensure
that the command is not read before the tail. In your case the hardware
side takes care of that.

> The above is just my personal understanding

Which might need some slight adjustments :)

Thanks,

        tglx