Message-ID: <20100408114542.47b6589a@lxorguk.ukuu.org.uk>
Date:	Thu, 8 Apr 2010 11:45:42 +0100
From:	Alan Cox <alan@...rguk.ukuu.org.uk>
To:	Michael Schnell <mschnell@...ino.de>
Cc:	linux-kernel@...r.kernel.org,
	nios2-dev <nios2-dev@...c.et.ntust.edu.tw>
Subject: Re: atomic RAM ?

>  - no normal processor "read-modify-write" instructions that by design
> are not interrupted or even bus-locking
>  - new "custom" processor instructions can be defined that might work
> atomically, as well not interruptible as kind of bus-locking for SMP use
>  - these custom instructions can't access the memory in normal way
> (through the MMU and the cache).
> 
> So to implement atomic instructions a dedicated RAM area would be needed
> to hold the atomically accessible data. Same can't be accessed by the
> CPU in other ways. This RAM area would be implemented with the new
> atomic instructions and be located within the FPGA and thus could be
> accessed very fast (no cache issues).

Take a look at 32-bit sparc. It has only a single meaningful atomic
instruction (swap byte with 0xFF), and it provides all the kernel atomic_t
operations on top of that: see arch/sparc/lib/atomic32.c. The bitops are
done in a similar way, which leaves spinlocks and the like.
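
As a rough illustration of the sparc32 point, here is a minimal user-space sketch of an atomic_t-style add built on nothing but a "swap byte with 0xFF" primitive. It is not the code from arch/sparc/lib/atomic32.c. The names swap_byte_ff, byte_lock_t, emu_atomic_t and emu_atomic_add_return are hypothetical, and C11 atomic_exchange stands in for the hardware instruction so the sketch can run on a host machine.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the single hardware primitive: atomically store 0xFF into
 * the byte and return what was there before. Using C11 atomic_exchange
 * here is an assumption made for the sake of a runnable sketch. */
static unsigned char swap_byte_ff(_Atomic unsigned char *p)
{
	return atomic_exchange_explicit(p, 0xFF, memory_order_acquire);
}

typedef struct { _Atomic unsigned char held; } byte_lock_t; /* 0 = free, 0xFF = held */
typedef struct { int counter; } emu_atomic_t;               /* hypothetical atomic_t look-alike */

static byte_lock_t emu_lock; /* static storage, so it starts out as 0 (free) */

static void byte_lock(byte_lock_t *l)
{
	/* A previous value of 0xFF means somebody else holds the lock: spin. */
	while (swap_byte_ff(&l->held) != 0)
		;
}

static void byte_unlock(byte_lock_t *l)
{
	assert(l->held == 0xFF); /* unlocking a free lock is a bug */
	atomic_store_explicit(&l->held, 0, memory_order_release);
}

/* The add itself is plain C; only the lock around it is atomic.
 * Real kernel code would also have to keep interrupts on the same CPU
 * from re-entering this section, which is not modelled here. */
static int emu_atomic_add_return(int i, emu_atomic_t *v)
{
	int ret;

	byte_lock(&emu_lock);
	v->counter += i;
	ret = v->counter;
	byte_unlock(&emu_lock);
	return ret;
}
```

Subtraction, increment and decrement follow from emu_atomic_add_return with a negative or unit argument, which is why one primitive is enough for the whole family.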

More importantly, if your true locks in the FPGA are really fast in CPU
terms, then you can think of every other atomic instruction as being
implemented using

		lock(cpu_atomic_instruction_lock)
		do bits
		unlock(cpu_atomic_instruction_lock)

(it's just that this is normally done in hardware/microcode)
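
To make the lock / do bits / unlock pattern concrete, the sketch below emulates two "instructions", a compare-and-exchange and a test-and-set-bit, under one global lock. The helper names hw_lock, hw_unlock, emu_cmpxchg and emu_test_and_set_bit are made up for illustration, and a C11 atomic_flag is only an assumed stand-in for the fast FPGA lock.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define EMU_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Assumed stand-in for the single fast hardware-assisted lock. */
static atomic_flag cpu_atomic_instruction_lock = ATOMIC_FLAG_INIT;

static void hw_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&cpu_atomic_instruction_lock,
						 memory_order_acquire))
		; /* spin until the previous holder clears the flag */
}

static void hw_unlock(void)
{
	atomic_flag_clear_explicit(&cpu_atomic_instruction_lock,
				   memory_order_release);
}

/* "do bits" for compare-and-exchange: returns the value seen in *ptr and
 * stores new_val only if that value equalled old_val. */
static int emu_cmpxchg(int *ptr, int old_val, int new_val)
{
	int prev;

	assert(ptr != NULL);
	hw_lock();
	prev = *ptr;
	if (prev == old_val)
		*ptr = new_val;
	hw_unlock();
	return prev;
}

/* "do bits" for test-and-set-bit: sets bit nr in the bitmap at addr and
 * returns whether it was already set (1) or not (0). */
static int emu_test_and_set_bit(unsigned int nr, unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % EMU_BITS_PER_LONG);
	unsigned long *word = addr + nr / EMU_BITS_PER_LONG;
	int was_set;

	hw_lock();
	was_set = (*word & mask) != 0;
	*word |= mask;
	hw_unlock();
	return was_set;
}
```

Everything between hw_lock() and hw_unlock() is ordinary loads and stores, so the cost of each emulated instruction is dominated by how fast the one lock is.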

Doing it per instruction might be a bit naïve, but I think you can
reasonably do it so that things like spinlocks use a single non-kernel
lock (or a hashed set of them) to implement "atomic" instructions, and as
sparc32 shows, you only need a tiny subset of them to implement the rest
in their terms.
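
One way to picture the "hashed set" variant is sketched below: the address of the word being operated on picks one lock out of a small fixed array, so unrelated words rarely contend. The array size, the hash (shift by 4, mask) and the names emu_hash_locks, emu_lock_for and emu_atomic_xchg are all assumptions chosen for illustration, with C11 atomics again standing in for the hardware locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define EMU_HASH_SIZE 4 /* hypothetical number of hardware locks */

static_assert((EMU_HASH_SIZE & (EMU_HASH_SIZE - 1)) == 0,
	      "hash size must be a power of two for the mask below");

/* One lock byte per bucket: 0 = free, 0xFF = held. Static storage, so
 * every bucket starts out free. */
static atomic_uchar emu_hash_locks[EMU_HASH_SIZE];

/* Map the address of the target word to one of the locks. Dropping the
 * low 4 bits is an arbitrary choice so that neighbouring words share a
 * bucket. */
static atomic_uchar *emu_lock_for(const volatile void *addr)
{
	return &emu_hash_locks[((uintptr_t)addr >> 4) & (EMU_HASH_SIZE - 1)];
}

/* Emulated atomic exchange: store val in *ptr and return the old value,
 * holding only the lock that *ptr hashes to. */
static int emu_atomic_xchg(int *ptr, int val)
{
	atomic_uchar *lock = emu_lock_for(ptr);
	int prev;

	while (atomic_exchange_explicit(lock, 0xFF, memory_order_acquire) != 0)
		; /* bucket busy, spin */
	prev = *ptr;
	*ptr = val;
	atomic_store_explicit(lock, 0, memory_order_release);
	return prev;
}
```

The locks live in a fixed table rather than next to the data, so nothing has to be allocated when the kernel creates a new lock or atomic_t.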

So I don't actually think you need any kernel core changes to get going,
and given the kernel dynamically allocates a lot of locks I suspect
trying to dynamically manage atomic RAM allocations is going to cost more
than executing a few instructions here and there under a single very fast
hardware-assisted lock.

Alan