linux-kernel - Re: atomic RAM ?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100408114542.47b6589a@lxorguk.ukuu.org.uk>
Date:	Thu, 8 Apr 2010 11:45:42 +0100
From:	Alan Cox <alan@...rguk.ukuu.org.uk>
To:	Michael Schnell <mschnell@...ino.de>
Cc:	linux-kernel@...r.kernel.org,
	nios2-dev <nios2-dev@...c.et.ntust.edu.tw>
Subject: Re: atomic RAM ?

>  - no normal processor "read-modify-write" instructions that by design
> are not interrupted or even bus-locking
>  - new "custom" processor instructions can be defined that might work
> atomically, as well not interruptible as kind of bus-locking for SMP use
>  - these custom instructions can't access the memory in normal way
> (through the MMU and the cache).
> 
> So to implement atomic instructions a dedicated RAM area would be needed
> to hold the atomically accessible data. Same can't be accessed by the
> CPU in other ways. This RAM area would be implemented with the new
> atomic instructions and be located within the FPGA and thus could
> accessed very fast (no cache issues).

Take a look at sparc 32bit. That only has a single meaningful atomic
instruction (swap byte with 0xFF). It provides all the kernel atomic_t
operations via this: arch/sparc/lib/atomic32.c. That bitops are done a
similar way, which leaves spinlocks and the like.

More importantly if your true locks in the FPGA are really fast in CPU
terms then you can think of every other atomic instructions as being
implemented using

		lock(cpu_atomic_instruction_lock)
		do bits
		unlock(cpu_atomic_instruction_lock)

(its just this is normally done in hardware/microcode)

Doing it per instruction might be a bit naïve but I think you can
reasonably do it so that things like spinlocks use a single (or a hashed
set) of non kernel locks to implement "atomic" instructions, and as
sparc32 shows you only need a tiny subset of them to implement the rest
in their terms.

So I don't actually think you need any kernel core changes to get going,
and given the kernel dynamically allocates a lot of locks I suspect
trying to dynamically manage atomic ram allocations is going to cost more
than executing a few instructions here and there under a single very fast
hardware assisted lock.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/