Date:	Tue, 8 Sep 2009 15:09:43 +0200
From:	Ralf Baechle <ralf@...ux-mips.org>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	Ingo Molnar <mingo@...e.hu>, Michael Buesch <mb@...sch.de>,
	Con Kolivas <kernel@...ivas.org>, linux-kernel@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mike Galbraith <efault@....de>, Felix Fietkau <nbd@...nwrt.org>
Subject: Re: BFS vs. mainline scheduler benchmarks and measurements

On Tue, Sep 08, 2009 at 07:50:00PM +1000, Benjamin Herrenschmidt wrote:

> On Tue, 2009-09-08 at 09:48 +0200, Ingo Molnar wrote:
> > So either your MIPS system has some unexpected dependency on the 
> > scheduler, or there's something weird going on.
> > 
> > Mind poking on this one to figure out whether it's all repeatable 
> > and why that slowdown happens? Multiple attempts to reproduce it 
> > failed here for me.
> 
> Could it be the scheduler using constructs that don't do well on MIPS ? 

It would surprise me.

I'm wondering if BFS has properties that make it perform better on a very
low memory system; I guess the BCM74xx system has only something like 32MB
or 64MB.

> I remember at some stage we spotted an expensive multiply in there,
> maybe there's something similar, or some unaligned or non-cache friendly
> vs. the MIPS cache line size data structure, that sort of thing ...
> 
> Is this a SW loaded TLB ? Does it miss on kernel space ? That could
> also be some differences in how many pages are touched by each scheduler
> causing more TLB pressure. This will be mostly invisible on x86.

Software refilled.  No misses ever for kernel space or low-mem; think of
it as low-mem and kernel executable living in a 512MB page that is mapped
by a mechanism outside the TLB.  Vmalloc ranges are TLB mapped.  Ioremap
address ranges are TLB mapped only if they lie above physical address 512MB.
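
To make the mapping rule concrete, here is a minimal sketch, assuming the
classic 32-bit MIPS segment layout (kseg0 at 0x80000000 and kseg1 at
0xa0000000, each a 512MB unmapped window onto physical 0 - 512MB).  The
macro and function names are made up for illustration and are not the
kernel's own helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed classic MIPS32 layout; all names here are illustrative only. */
#define KSEG0_BASE 0x80000000u /* unmapped, cached: low-mem and kernel text */
#define KSEG1_BASE 0xa0000000u /* unmapped, uncached: low physical I/O      */
#define KSEG2_BASE 0xc0000000u /* TLB mapped kernel space, e.g. vmalloc     */
#define KSEG_SIZE  0x20000000u /* each unmapped window covers 512MB         */

/* True if a virtual address is translated through the TLB. */
static inline bool va_needs_tlb(uint32_t va)
{
    /* Only kseg0 and kseg1 bypass the TLB; user space and kseg2 use it. */
    return va < KSEG0_BASE || va >= KSEG2_BASE;
}

/* kseg0 translation is a fixed offset, so it can never miss. */
static inline uint32_t kseg0_to_phys(uint32_t va)
{
    assert(va >= KSEG0_BASE && va < KSEG1_BASE);
    return va - KSEG0_BASE;
}

/* An I/O range needs a TLB mapping only if it reaches past physical 512MB. */
static inline bool ioremap_needs_tlb(uint32_t phys, uint32_t size)
{
    return (uint64_t)phys + size > KSEG_SIZE;
}
```

Under these assumptions a scheduler data structure in low-mem never costs
a TLB refill, while anything that lives in vmalloc space can.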

An emulated unaligned load/store is very expensive; one that is encoded
properly by GCC for __attribute__((packed)) is only 1 cycle and 1
instruction ( = 4 bytes) extra.
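
As a rough illustration of the cheap case, the sketch below uses a
hypothetical packed header (struct hdr is invented for this example)
whose 32-bit field sits at offset 1.  Because of the attribute GCC knows
the field may be misaligned and emits an access that is safe for it; on
pre-R6 MIPS that is typically an lwl/lwr pair in place of a single lw.
Casting an arbitrary pointer to uint32_t * and dereferencing it gives the
compiler no such hint, and that is the case that can end up in the
expensive emulation path.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header: 'len' sits at offset 1 and is therefore misaligned. */
struct hdr {
    uint8_t  type;
    uint32_t len;
} __attribute__((packed));

/* The packed attribute removes the padding after 'type'. */
static_assert(sizeof(struct hdr) == 5, "packed layout expected");

/* The compiler generates an unaligned-safe load for the packed field. */
static inline uint32_t hdr_len(const struct hdr *h)
{
    return h->len;
}

/* Portable alternative for a raw pointer of unknown alignment. */
static inline uint32_t load_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v)); /* usually compiled to the same cheap sequence */
    return v;
}
```

Either form keeps the access out of the trap handler; which one fits
depends on whether the layout is described by a struct or by raw offsets.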

> At this stage, it will be hard to tell without some profile data I
> suppose. Maybe next week I can try on a small SW loaded TLB embedded PPC
> see if I can reproduce some of that, but no promises here.

  Ralf