[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c72402f2-967e-cd56-99d8-9139c9e7f267@google.com>
Date: Fri, 1 Mar 2019 16:27:54 -0500
From: Barret Rhoden <brho@...gle.com>
To: Dennis Zhou <dennis@...nel.org>
Cc: Eial Czerwacki <eial@...lemp.com>, tj@...nel.org, cl@...ux.com,
linux-kernel@...r.kernel.org, Shai Fultheim <shai@...lemp.com>,
Oren Twaig <oren@...lemp.com>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>
Subject: Re: [PATCH] percpu/module resevation: change resevation size iff
X86_VSMP is set
Hi -
On 03/01/2019 03:34 PM, Dennis Zhou wrote:
> Hi Barret,
>
> On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:
>> Hi -
>>
>> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>>>
>>
>> Your main issue was that you only sent this patch to LKML, but not the
>> maintainers of the file. If you don't, your patch might get lost. To get
>> the appropriate people and lists, run:
>>
>> scripts/get_maintainer.pl YOUR_PATCH.patch.
>>
>> For this patch, you'll get this:
>>
>> Dennis Zhou <dennis@...nel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Tejun Heo <tj@...nel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Christoph Lameter <cl@...ux.com> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> linux-kernel@...r.kernel.org (open list)
>>
>> I added the three maintainers to this email.
>>
>> I have a few minor comments below.
>>
>>> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is
>> set
>>
>> You misspelled 'reservation'. Also, I'd just say: "percpu: increase module
>> reservation size if X86_VSMP is set". ('change' -> 'increase'), only says
>> 'reservation' once.)
>>
>>> as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
>>
>> I think you can add a tag for this right above your Signed-off-by tags.
>> e.g.:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
>>
>>> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
>>> causing the struct size to exceed the size ok 8KB.
>> ^of
>>
>> Which struct are you talking about? I have one in mind, but others might
>> not know from reading the commit message.
>>
>> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. In
>> that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
>> which does a DEFINE_PER_CPU on struct srcu_data, and that used
>> ____cacheline_internodealigned_in_smp.
>>
>>>
>>> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
>> ^increase
>>
>>>
>>> the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
>> ^calculated
>>
>>> for f in `find -name *.ko`; do echo $f; readelf -S $f |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
>>
>> Not sure how useful the one-liner is, versus a description of what you're
>> doing. i.e. "the size of all module percpu data sections, or something."
>>
>> Also, how close was that calculated value to 64K? If more modules start
>> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
>>
>> Thanks,
>> Barret
>>
>>> Signed-off-by: Eial Czerwacki <eial@...lemp.com>
>>> Signed-off-by: Shai Fultheim <shai@...lemp.com>
>>> Signed-off-by: Oren Twaig <oren@...lemp.com>
>>> ---
>>> include/linux/percpu.h | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>>> index 70b7123..6b79693 100644
>>> --- a/include/linux/percpu.h
>>> +++ b/include/linux/percpu.h
>>> @@ -14,7 +14,11 @@
>>> /* enough to cover all DEFINE_PER_CPUs in modules */
>>> #ifdef CONFIG_MODULES
>>> +#ifdef X86_VSMP
>>> +#define PERCPU_MODULE_RESERVE (1 << 16)
>>> +#else
>>> #define PERCPU_MODULE_RESERVE (8 << 10)
>>> +#endif
>>> #else
>>> #define PERCPU_MODULE_RESERVE 0
>>> #endif
>>>
>>
>
> Thanks for sending this to me.
>
> I must say, I really do not want to expand the reserved region. In most
> cases, it can easily end up unused and thus wasted memory as it is hard
> allocated on boot. This is done because code gen assumes static
> variables are close to the program counter. This would not be true with
> dynamic allocations which being at the end of the vmalloc area
> (Summarized from Tejun's account in [1]).
>
> Another note on the reserved region. It starts at the end of the static
> region which means it generally isn't page aligned. So while an 8kb
> allocation would fit, a 4kb alignment more than likely would fail.
> Something as large as 8kb should probably be dynamically allocated as
> well.
>
> I read through the bugzilla report and it seems that the culprits are:
> drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
> drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);
>
> Is there a reason we cannot dynamically initialize these structs? I've
> cced Paul McKenney because we saw an issue with ipmi in December [1].
I looked at the AMD driver, and it looks like they could dynamically
initialize it. It would require a little extra plumbing. I imagine the
DRM one is the same way.
To catch this in the future, should we disallow DEFINE_SRCU in modules
or something? Otherwise, this will pop up again the next time someone
uses DEFINE_SRCU in a module and builds with CONFIG_X86_VSMP.
That might be a little much, and it still won't be sufficient to catch
all cases. This will also come up any time a module has a static
per-cpu data structure that uses __cacheline_aligned_in_smp, so it's not
limited to SRCU either.
I'm not familiar with VSMP - how bad is it to use L1 cache alignment
instead of 4K page alignment? Maybe some structures can use the smaller
alignment? Or maybe have VSMP require SRCU-using modules to be built-in?
Thanks,
Barret
Powered by blists - more mailing lists