lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 1 Mar 2019 16:27:54 -0500
From:   Barret Rhoden <>
To:     Dennis Zhou <>
Cc:     Eial Czerwacki <>,,,, Shai Fultheim <>,
        Oren Twaig <>,
        "Paul E. McKenney" <>
Subject: Re: [PATCH] percpu/module resevation: change resevation size iff
 X86_VSMP is set

Hi -

On 03/01/2019 03:34 PM, Dennis Zhou wrote:
> Hi Barret,
> On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:
>> Hi -
>> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>> Your main issue was that you only sent this patch to LKML, but not the
>> maintainers of the file.  If you don't, your patch might get lost.  To get
>> the appropriate people and lists, run:
>> 	scripts/ YOUR_PATCH.patch.
>> For this patch, you'll get this:
>> Dennis Zhou <> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Tejun Heo <> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Christoph Lameter <> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> (open list)
>> I added the three maintainers to this email.
>> I have a few minor comments below.
>>> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is
>> set
>> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase module
>> reservation size if X86_VSMP is set".  ('change' -> 'increase'), only says
>> 'reservation' once.)
>>> as reported in bug #201339 (
>> I think you can add a tag for this right above your Signed-off-by tags.
>> e.g.:
>> Bugzilla:
>>> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
>>> causing the struct size to exceed the size ok 8KB.
>>                                              ^of
>> Which struct are you talking about?  I have one in mind, but others might
>> not know from reading the commit message.
>> I ran into this on In
>> that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
>> which does a DEFINE_PER_CPU on struct srcu_data, and that used
>> ____cacheline_internodealigned_in_smp.
>>> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
>>                                 ^increase
>>> the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
>>                 ^calculated
>>> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
>> Not sure how useful the one-liner is, versus a description of what you're
>> doing.  i.e. "the size of all module percpu data sections, or something."
>> Also, how close was that calculated value to 64K?  If more modules start
>> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
>> Thanks,
>> Barret
>>> Signed-off-by: Eial Czerwacki <>
>>> Signed-off-by: Shai Fultheim <>
>>> Signed-off-by: Oren Twaig <>
>>> ---
>>>    include/linux/percpu.h | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>>> index 70b7123..6b79693 100644
>>> --- a/include/linux/percpu.h
>>> +++ b/include/linux/percpu.h
>>> @@ -14,7 +14,11 @@
>>>    /* enough to cover all DEFINE_PER_CPUs in modules */
>>>    #ifdef CONFIG_MODULES
>>> +#ifdef X86_VSMP
>>> +#define PERCPU_MODULE_RESERVE		(1 << 16)
>>> +#else
>>>    #define PERCPU_MODULE_RESERVE		(8 << 10)
>>> +#endif
>>>    #else
>>>    #define PERCPU_MODULE_RESERVE		0
>>>    #endif
> Thanks for sending this to me.
> I must say, I really do not want to expand the reserved region. In most
> cases, it can easily end up unused and thus wasted memory as it is hard
> allocated on boot. This is done because code gen assumes static
> variables are close to the program counter. This would not be true with
> dynamic allocations which being at the end of the vmalloc area
> (Summarized from Tejun's account in [1]).
> Another note on the reserved region. It starts at the end of the static
> region which means it generally isn't page aligned. So while an 8kb
> allocation would fit, a 4kb alignment more than likely would fail.
> Something as large as 8kb should probably be dynamically allocated as
> well.
> I read through the bugzilla report and it seems that the culprits are:
>    drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
>    drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);
> Is there a reason we cannot dynamically initialize these structs? I've
> cced Paul McKenney because we saw an issue with ipmi in December [1].

I looked at the AMD driver, and it looks like they could dynamically 
initialize it.  It would require a little extra plumbing.  I imagine the 
DRM one is the same way.

To catch this in the future, should we disallow DEFINE_SRCU in modules 
or something?  Otherwise, this will pop up again the next time someone 
uses DEFINE_SRCU in a module and builds with CONFIG_X86_VSMP.

That might be a little much, and it still won't be sufficient to catch 
all cases.  This will also come up any time a module has a static 
per-cpu data structure that uses __cacheline_aligned_in_smp, so it's not 
limited to SRCU either.

I'm not familiar with VSMP - how bad is it to use L1 cache alignment 
instead of 4K page alignment?  Maybe some structures can use the smaller 
alignment?  Or maybe have VSMP require SRCU-using modules to be built-in?



Powered by blists - more mailing lists