Message-ID: <878s6iatdf.fsf@nanos.tec.linutronix.de>
Date: Sat, 20 Mar 2021 13:42:52 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Fenghua Yu <fenghua.yu@...el.com>
Cc: Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Peter Zijlstra <peterz@...radead.org>,
Tony Luck <tony.luck@...el.com>,
Randy Dunlap <rdunlap@...radead.org>,
Xiaoyao Li <xiaoyao.li@...el.com>,
Ravi V Shankar <ravi.v.shankar@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>
Subject: Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock
On Fri, Mar 19 2021 at 22:19, Fenghua Yu wrote:
> On Fri, Mar 19, 2021 at 10:30:50PM +0100, Thomas Gleixner wrote:
>> > + if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 && ratelimit > 0) {
>> > + bld_ratelimit = ratelimit;
>>
>> So any rate up to INTMAX/s is valid here, right?
>
> Yes. I don't see a smaller limit than INTMAX/s. Is that right?
That's a given, but what's the point of limits in that range?
A buslock access locks up the system for X cycles. So the total amount
of allowable damage in cycles per second is:
limit * stall_cycles_per_bus_lock
ergo the time (in seconds) which the system is locked up is:
limit * stall_cycles_per_bus_lock / cpufreq
Which means for ~INTMAX/2 on a 2 GHz CPU:
(2 * 10^9 * $CYCLES) / (2 * 10^9) = $CYCLES seconds
Assuming the inflicted damage is only 1 cycle, #LOCK is pretty much
permanently on if there are enough threads. Sure, #DB will slow them
down, but it still does not make any sense at all, especially as the
damage is certainly greater than a single cycle.
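To make the arithmetic above easy to replay with other inputs, here is a
minimal sketch in plain userspace C. The function name and parameters are
made up for illustration and are not part of the patch; the stall cost per
bus lock is an input you have to measure yourself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds of bus-lock stall accumulated per second of wall-clock time:
 *     limit * stall_cycles_per_bus_lock / cpufreq
 * A result >= 1.0 means the bus is locked essentially all the time.
 */
double buslock_stall_fraction(uint64_t limit_per_sec,
			      uint64_t stall_cycles_per_bus_lock,
			      double cpu_hz)
{
	assert(cpu_hz > 0.0);
	return (double)limit_per_sec * (double)stall_cycles_per_bus_lock / cpu_hz;
}
```

With a limit of 2 * 10^9 per second, 1 stall cycle and a 2 GHz clock this
returns 1.0, which is the "permanently on" case. Any realistic stall cost
only makes it worse.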
And because the changelogs and the docs are void of numbers I just got
real numbers myself.
With a single thread doing a 'lock inc *mem' across a cache line
boundary the workload which I measured with perf stat goes from:
5,940,985,091 instructions # 0.88 insn per cycle
2.780950806 seconds time elapsed
0.998480000 seconds user
4.202137000 seconds sys
to
7,467,979,504 instructions # 0.10 insn per cycle
5.110795917 seconds time elapsed
7.123499000 seconds user
37.266852000 seconds sys
The buslock injection rate is ~250k per second.
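For reference, a locked increment across a cache line boundary can be
sketched roughly like this. This is not the test program used for the
numbers above, just an illustration under assumptions: 64-byte cache
lines, GCC/Clang inline asm on x86, and invented helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption: 64-byte lines, typical for x86 */

/* Two cache lines of backing store, aligned to a line boundary. */
_Alignas(CACHE_LINE_SIZE) unsigned char split_buf[2 * CACHE_LINE_SIZE];

/* True if the access [p, p + size) straddles a cache line boundary. */
int crosses_cache_line(const void *p, size_t size)
{
	assert(size > 0 && size <= CACHE_LINE_SIZE);
	return ((uintptr_t)p % CACHE_LINE_SIZE) + size > CACHE_LINE_SIZE;
}

/* A 4-byte target whose last two bytes spill into the second line. */
void *split_lock_target(void)
{
	return split_buf + CACHE_LINE_SIZE - 2;
}

#if defined(__x86_64__) || defined(__i386__)
/* One locked 32-bit increment; on a split target this takes a bus lock. */
void locked_inc(void *p)
{
	__asm__ volatile("lock incl %0"
			 : "+m"(*(volatile uint32_t *)p)
			 :
			 : "memory");
}
#endif
```

Calling locked_inc(split_lock_target()) in a loop generates the bus locks.
Be careful where you run that: with split lock detection set to fatal the
task gets SIGBUS, and without it the rest of the machine pays the price.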
Even if I ratelimit the locked inc by a delay loop of ~5000 cycles,
which is probably more than what the #DB will cost, this single task
still impacts the workload significantly:
6,496,994,537 instructions # 0.39 insn per cycle
3.043275473 seconds time elapsed
1.899852000 seconds user
8.957088000 seconds sys
The buslock injection rate is down to ~150k per second in this case.
And even with throttling the injection rate further down to 25k per
second the impact on the workload is still significant in the 10% range.
And of course the documentation of the ratelimit parameter explains all
of this in great detail so the administrator has a trivial job to tune
that, right?
>> > + case sld_ratelimit:
>> > + /* Enforce no more than bld_ratelimit bus locks/sec. */
>> > + while (!__ratelimit(&get_current_user()->bld_ratelimit))
>> > + msleep(1000 / bld_ratelimit);
For any ratelimit > 1000 this will loop up to 1000 times with
CONFIG_HZ=1000.
Assume that the buslock producer has tons of threads which all end up
here pretty soon. Then you launch a mass wakeup every jiffy in the worst
case. Are you sure that the cure is better than the disease?
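The retry count follows from the integer division in the quoted loop. A
small userspace model, assuming a one second ratelimit window and assuming
that msleep() rounds its argument up to jiffies and adds one jiffy (so
msleep(0) still sleeps for a tick); the helper names are invented for this
sketch and are not kernel functions:

```c
#include <assert.h>

/* Sleep argument computed by the quoted loop: 1000 / bld_ratelimit. */
unsigned int bld_sleep_ms(unsigned int ratelimit)
{
	assert(ratelimit > 0);
	return 1000 / ratelimit;	/* 0 for any ratelimit > 1000 */
}

/* Assumed msleep() model: round ms up to jiffies, then add one jiffy. */
unsigned int model_sleep_jiffies(unsigned int ms, unsigned int hz)
{
	return (ms * hz + 999) / 1000 + 1;
}

/* Upper bound on sleep/wakeup cycles per thread within one second. */
unsigned int max_wakeups_per_sec(unsigned int ratelimit, unsigned int hz)
{
	assert(hz > 0);
	return hz / model_sleep_jiffies(bld_sleep_ms(ratelimit), hz);
}
```

Under these assumptions max_wakeups_per_sec(2000, 1000) is 1000, i.e. one
wakeup per jiffy for every throttled thread, while a limit of 10 gives 9.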
> If I split this whole patch set into two patch sets:
> 1. Three patches in the first patch set: the enumeration patch, the warn
> and fatal patch, and the documentation patch.
> 2. Two patches in the second patch set: the ratelimit patch and the
> documentation patch.
>
> Then I will send the two patch sets separately, you will accept them one
> by one. Is that OK?
That's obviously the right thing to do because #1 should be ready and we
can sort out #2 separately. See the conversation with Tony.
Thanks,
tglx