lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 3 Mar 2015 19:45:27 +0100
From:	Borislav Petkov <bp@...e.de>
To:	Andre Przywara <andre.przywara@....com>,
	Sudeep Holla <sudeep.holla@....com>, Tejun Heo <tj@...nel.org>
Cc:	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org
Subject: Re: [PATCH RFT v2] x86: move cacheinfo sysfs to generic cacheinfo
 infrastructure

@Tejun: scroll to the end :)

On Sun, Mar 01, 2015 at 10:39:24PM +0000, Andre Przywara wrote:
> So I gave that a spin on my old PhenomII X6 box, and indeed there is a
> NULL pointer dereference. I tracked it down to
> __cache_amd_cpumap_setup(), where the sibling map for the L3 is
> populated. When the function is called for the first CPU, subsequent
> CPU's cacheinfo structures obviously haven't been initialized yet, so

Right, let me try to understand the splat:

[    3.030598] RIP: 0010:[<ffffffff810141c0>]  [<ffffffff810141c0>] _populate_cache_leaves+0x440/0x460

This is:

     9d6:       49 0f a3 06             bt     %rax,(%r14)
     9da:       19 c9                   sbb    %ecx,%ecx
     9dc:       85 c9                   test   %ecx,%ecx
     9de:       74 d0                   je     9b0 <_populate_cache_leaves+0x410>
     9e0:       f0 49 0f ab 84 24 f8    lock bts %rax,0xf8(%r12)			<----
     9e7:       00 00 00 
     9ea:       eb c4                   jmp    9b0 <_populate_cache_leaves+0x410>
     9ec:       4c 8b 65 98             mov    -0x68(%rbp),%r12
     9f0:       e9 eb fd ff ff          jmpq   7e0 <_populate_cache_leaves+0x240>
     9f5:       66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)

%r12 is 0 and that is this hunk in __cache_amd_cpumap_setup():

        } else if (index == 3) {
                for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
                        this_cpu_ci = get_cpu_cacheinfo(i);
                        this_leaf = this_cpu_ci->info_list + index;
                        for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
                                if (!cpu_online(sibling))
                                        continue;
                                cpumask_set_cpu(sibling,			<--- set_bit() does LOCK
                                                &this_leaf->shared_cpu_map);
                        }
                }

so this_leaf is NULL.

We get it by doing

	this_leaf = this_cpu_ci->info_list + index;

and this_cpu_ci is a per-CPU variable.

Now, previously the code did

-			if (!per_cpu(ici_cpuid4_info, i))
-				continue;


and __cache_cpumap_setup() already does:

                        if (i == cpu || !sib_cpu_ci->info_list)
                                continue;/* skip if itself or no cacheinfo */

so maybe we should do that too in __cache_amd_cpumap_setup():

                        if (!this_cpu_ci->info_list)
                                continue;

for the index == 3 case?

It boots fine here with that change and it is consistent with the
previous code.

And yes, the x86 cacheinfo code could use a serious rubbing and cleanup.

Btw, this patch introduces a bunch of new sysfs nodes in the caches
hierarchy:

--- caches-guest-before.txt     2015-03-03 15:11:09.168276423 +0100
+++ caches-guest-after.txt      2015-03-03 18:19:04.084426130 +0100
@@ -1,6 +1,22 @@
+/sys/devices/system/cpu/cpu0/cache/power/control:1:auto
+/sys/devices/system/cpu/cpu0/cache/power/async:1:disabled
+/sys/devices/system/cpu/cpu0/cache/power/runtime_enabled:1:disabled
+/sys/devices/system/cpu/cpu0/cache/power/runtime_active_kids:1:0
+/sys/devices/system/cpu/cpu0/cache/power/runtime_active_time:1:0
+/sys/devices/system/cpu/cpu0/cache/power/runtime_status:1:unsupported
+/sys/devices/system/cpu/cpu0/cache/power/runtime_usage:1:0
+/sys/devices/system/cpu/cpu0/cache/power/runtime_suspended_time:1:0
 /sys/devices/system/cpu/cpu0/cache/index0/size:1:64K
 /sys/devices/system/cpu/cpu0/cache/index0/type:1:Data
 /sys/devices/system/cpu/cpu0/cache/index0/level:1:1
+/sys/devices/system/cpu/cpu0/cache/index0/power/control:1:auto
+/sys/devices/system/cpu/cpu0/cache/index0/power/async:1:disabled
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_enabled:1:disabled
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_active_kids:1:0
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_active_time:1:0
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_status:1:unsupported
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_usage:1:0
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_suspended_time:1:0
 /sys/devices/system/cpu/cpu0/cache/index0/number_of_sets:1:512
 /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map:1:1
 /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_list:1:0
 ...

What do those things mean? runtime_active_kids ?? Kids are active during
runtime?! Well, that's a given, no need for a sysfs node for that :-)

> > Indeed I see a regression between 3.19.0 and a clean 4.0.0-rc1 in this
> > respect, the diff is like:
> > -cpu5/cache/index3/shared_cpu_map:00000000,0000003f
> > +cpu5/cache/index3/shared_cpu_map:3f
> > (for each cpu and index)
> >
> > Can someone confirm (and investigate) this?
>
> I saw this even on ARM64 platforms with 4.0-rc1 and did narrow down to
> series of formatting functions changes introduced by Tejun Heo, mainly
> commit 4a0792b0e7a6("bitmap: use %*pb[l] to print bitmaps including
> cpumasks and nodemasks")

Strictly speaking, this is a regression if userspace is parsing those
masks. I hardly doubt anything is doing that but if we had to be
completely strict we should add back the zeroes.

Tejun, we have this change in user-visible masks formatting in sysfs
after your bitmaps printing changes:

-cpu5/cache/index3/shared_cpu_map:00000000,0000003f
+cpu5/cache/index3/shared_cpu_map:3f

What's the rationale on userspace parsing bitmaps in sysfs, we don't
care?

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ