linux-kernel - Re: [PATCH -tip/cpus4096-v2] cpumask: fix cpumask of call_function


Message-ID: <490649AE.6050905@ct.jp.nec.com>
Date:	Mon, 27 Oct 2008 16:07:26 -0700
From:	Hiroshi Shimamoto <h-shimamoto@...jp.nec.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Rusty Russell <rusty@...tcorp.com.au>,
	Mike Travis <travis@....com>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH -tip/cpus4096-v2] cpumask: fix cpumask of	call_function_data

Ingo Molnar wrote:
> * Ingo Molnar <mingo@...e.hu> wrote:
> 
>> in any case, i've started testing tip/cpus4096-v2 again on x86 - the 
>> problem with d4de5a above was the only outstanding known issue, right?
> 
> the sched_init() slab corruption bug is still there, i just triggered it 
> on two separate test-systems:
> 
> [    0.510620] CPU1 attaching sched-domain:
> [    0.512007]  domain 0: span 0-1 level CPU
> [    0.517730]   groups: 1 0
> [    0.520528] =============================================================================
> [    0.524002] BUG kmalloc-8: Wrong object count. Counter is 11 but counted were 50
> [    0.524002] -----------------------------------------------------------------------------
> [    0.524002] 

Hm,

I think the kmalloc-8 allocation is too small.
In this case, struct cpumask is defined as:

struct cpumask {
        DECLARE_BITMAP(bits, NR_CPUS);
};

So storing a cpumask such as cpu_core_map, cpu_sibling_map, sd->span, etc.
requires NR_CPUS bits. In Ingo's config, that is 4096 bits.

However, alloc_cpumask_var() uses cpumask_size() as the size for kmalloc():

bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
{
        if (likely(slab_is_available()))
                *mask = kmalloc(cpumask_size(), flags);

cpumask_size() is computed from nr_cpumask_bits, which is defined as follows:

#define nr_cpumask_bits nr_cpu_ids

This is the definition used in the CONFIG_NR_CPUS > BITS_PER_LONG case,
and nr_cpu_ids is 2 in this boot log:

...
> [    0.000000] PERCPU: Allocating 1900544 bytes of per cpu data
> [    0.000000] NR_CPUS:4096 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1

So alloc_cpumask_var() ends up calling kmalloc(8, flags) for the cpumask_var_t.
But the contents are treated as a full cpumask_t, so copying mask data into it
writes past the 8-byte object and corrupts the slab.

For example, take cpu_to_core_group():

static int
cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
                  cpumask_t *mask)
{
        int group;

        *mask = per_cpu(cpu_sibling_map, cpu);

This copies 0x200 bytes (= 4096 bits). Compiled in my environment, it looks as follows:
ffffffff80251c56 <cpu_to_core_group>:
cpu_to_core_group():
ffffffff80251c56:       55                      push   %rbp
ffffffff80251c57:       48 63 ff                movslq %edi,%rdi
ffffffff80251c5a:       48 89 e5                mov    %rsp,%rbp
ffffffff80251c5d:       41 55                   push   %r13
ffffffff80251c5f:       49 89 d5                mov    %rdx,%r13
ffffffff80251c62:       ba 00 02 00 00          mov    $0x200,%edx
ffffffff80251c67:       41 54                   push   %r12
ffffffff80251c69:       49 89 f4                mov    %rsi,%r12
ffffffff80251c6c:       48 c7 c6 00 c1 c8 81    mov    $0xffffffff81c8c100,%rsi
ffffffff80251c73:       53                      push   %rbx
ffffffff80251c74:       48 89 cb                mov    %rcx,%rbx
ffffffff80251c77:       48 83 ec 08             sub    $0x8,%rsp
ffffffff80251c7b:       48 8b 05 3e d0 98 01    mov    0x198d03e(%rip),%rax        # ffffffff81bdecc0 <_cpu_pda>
ffffffff80251c82:       48 8b 04 f8             mov    (%rax,%rdi,8),%rax
ffffffff80251c86:       48 89 cf                mov    %rcx,%rdi
ffffffff80251c89:       48 03 70 08             add    0x8(%rax),%rsi
ffffffff80251c8d:       e8 de 29 25 00          callq  ffffffff804a4670 <__memcpy>

The third argument to __memcpy is rdx = 0x200.

So I guess we need
kmalloc(BITS_TO_LONGS(NR_CPUS) * sizeof(long), flags)
in alloc_cpumask_var().

Or should we change the cpumask handling in sched.c and elsewhere?
I don't have any further ideas on this right now.

thanks,
Hiroshi Shimamoto