[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <mhng-e1799ed5-c012-40a1-b730-c20627cdb3f9@palmer-ri-x1c9a>
Date: Thu, 01 Dec 2022 12:00:43 -0800 (PST)
From: Palmer Dabbelt <palmer@...osinc.com>
To: Andrea Parri <andrea@...osinc.com>
CC: jrtc27@...c27.com, guoren@...nel.org, jszhang@...nel.org,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] riscv: fix race when vmap stack overflow
On Wed, 30 Nov 2022 18:43:32 PST (-0800), Andrea Parri wrote:
>> >>> @@ -23,6 +23,7 @@
>> >>> #define REG_L __REG_SEL(ld, lw)
>> >>> #define REG_S __REG_SEL(sd, sw)
>> >>> #define REG_SC __REG_SEL(sc.d, sc.w)
>> >>> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
>> >> Below is the reason why I use the relax version here:
>> >> https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u
>> >
>> > Sorry, I hadn't seen that one. Adding Andrea. IMO the acquire/release pair is necessary here, with just relaxed the stack stores inside the lock could show up on the next hart trying to use the stack.
>>
>> I think what you really want is a *consume* barrier, and since you have
>> the data dependency between the amoswap and the memory accesses (and
>> this isn’t Alpha) you’re technically fine without the acquire, since
>> you’re writing assembly and have the data dependency as syntactic.
>> Though you may still want (need?) the acquire so loads/stores unrelated
>> to the stack pointer that happen later in program order get ordered
>> after the load of the new stack pointer in case there could be weird
>> issues *there*.
>
> Agreed.
>
> Just the fact that this is the 4th iteration of this discussion strongly
> suggests to stick to the acquire and these inline comments to me. ;)
I spent a little time last night trying to reason about the no-AQ
version and I think it might actually be correct: the AMOSWAP is on the
lock and SP is overwritten when loading up the actual stack so I don't
think that's enough alone, but the no-speculative-accesses rule might be
enough here. Also I think mabye none of that even matters, because the
same-address rules might bail us out due to the nature of stack
accesses.
That said, this is some complicated and subtle reasoning. The
performance here doesn't matter so I'm just going to err on the side of
caution, but if someone cares enough to come up with concrete reasoning
as to why the barrier isn't necessary I'll at least look at the patch
(though I'll probably gnumble the whole time, as I hate being tricked
into thinking).
That'd be for-next material anyway, so the yes-AQ verison is on fixes
beacuse there's a concrete breakage being fixed.
Powered by blists - more mailing lists