linux-kernel - Re: [ANNOUNCE] Ramback: faster than a speeding bullet

Message-ID: <alpine.DEB.1.00.0803151540150.22040@asgard.lang.hm>
Date:	Sat, 15 Mar 2008 16:22:20 -0700 (PDT)
From:	david@...g.hm
To:	Daniel Phillips <phillips@...nq.net>
cc:	Willy Tarreau <w@....eu>, Alan Cox <alan@...rguk.ukuu.org.uk>,
	David Newall <davidn@...idnewall.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [ANNOUNCE] Ramback: faster than a speeding bullet

On Sat, 15 Mar 2008, Daniel Phillips wrote:

> On Saturday 15 March 2008 14:54, Willy Tarreau wrote:
>> I think it could get major adoption with ordered writes.
>
> It already has ordered write when it is in flush mode.
>
> OK, I hear you. There will be an ordered write mode that uses barriers
> to decide the ordering.  It will greatly reduce the speed at which
> ramback can flush dirty data because of the need to wait synchronously
> on every barrier, of which there are many.  And thus will widen out the
> window during which UPS power must remain available if power goes out,
> in order to get all acknowledged transactions on to stable media.  The
> advantage is, the stable media always has a point-in-time version of
> the filesystem.

it will mean that the window is larger, but it will also mean that if 
something else goes wrong and that window is not available, the data that 
was written out will be usable (recent data will be lost, but older data 
will still be available)
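
The difference can be sketched with a toy model. This is not ramback code: the `flush` function, the tuple-based queue, and the "sort by block number" stand-in for an unordered flush are all assumptions made for illustration. It replays queued writes until power runs out and shows that an ordered flush never lets a write from after a barrier reach the disk ahead of a write from before it.

```python
def flush(ops, writes_before_failure, ordered=True):
    """Replay queued writes onto a simulated disk until the power runs out.

    ops: list of ("write", block, data) or ("barrier",) tuples, in issue order.
    (Hypothetical model for illustration, not ramback's data structures.)
    """
    # Split the queue into epochs; each barrier closes the current epoch.
    epochs, current = [], []
    for op in ops:
        if op[0] == "barrier":
            epochs.append(current)
            current = []
        else:
            current.append(op)
    epochs.append(current)

    writes = [w for epoch in epochs for w in epoch]
    if ordered:
        # Ordered mode: every write of an epoch precedes the next epoch.
        schedule = writes
    else:
        # Unordered mode: free to reorder, modelled here as block order.
        schedule = sorted(writes, key=lambda w: w[1])

    disk = {}
    for _kind, block, data in schedule[:writes_before_failure]:
        disk[block] = data
    return disk


# Journal-style example: data block first, barrier, then the commit record.
queue = [("write", 5, "data"), ("barrier",), ("write", 1, "commit")]
print(flush(queue, 1, ordered=True))   # {5: 'data'}
print(flush(queue, 1, ordered=False))  # {1: 'commit'}
```

With power for only one write, the ordered flush leaves data without a commit record (the transaction is lost but the disk is consistent), while the unordered flush leaves a commit record pointing at data that never arrived.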

as for things that can go wrong:

the UPS battery can go bad
you can have multiple power failures in a short time so your battery is not fully charged
capacitors in the UPS can go bad
capacitors in the power supply can go bad
capacitors on the motherboard can go bad
a kernel bug can crash the system
a bug in a device driver (say nvidia graphics driver) can crash the system
a card in the system can lock up the system bus
the system power supply can die
the system fans can die and cause the system to overheat
cooling in the room the system is in can fail and cause the system to overheat
airflow to the computer can get blocked and cause the system to overheat
some other component in the computer can short out and cause the system to lose power internally

I have had every single one of these things happen to me over the years. 
Some on personal equipment, some on work equipment. At work I recently had 
a series of disasters where capacitors in a 7 figure UPS blew up, and a 
few days later during a power outage when we were running on generator, a 
fuel company made a mistake while adding fuel to the generator and knocked 
it out.

Even if you spend millions on equipment and professionals to set it up and 
maintain it, you can still go down.

You may not care about it on your system (because you copy data elsewhere 
and don't change it rapidly), but most people do. with your current 
approach you are slightly better than a couple of shell scripts from an 
availability point of view, you are no better in performance, but your 
failure mode is complete disaster.

comparing you to 'cp drive ramdisk' at startup and 'rsync ramdisk drive' 
periodically and at shutdown, you are faster at startup, close enough at 
shutdown as to be in the noise (either one could be faster, depending on 
the exact conditions)
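
For reference, the "couple of shell scripts" baseline amounts to something like the sketch below. It is written in Python rather than shell, the names `load_into_ram` and `sync_back` are made up for this example, and it operates on directory trees where the scripts above would work on a drive and a ramdisk. Unlike rsync it does not propagate deletions.

```python
import os
import shutil


def load_into_ram(disk_dir, ram_dir):
    """Startup step: copy the on-disk tree into the RAM-backed directory."""
    shutil.copytree(disk_dir, ram_dir, dirs_exist_ok=True)  # Python 3.8+


def _same_contents(a, b):
    """Byte-for-byte comparison of two files."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        return fa.read() == fb.read()


def sync_back(ram_dir, disk_dir):
    """Periodic and shutdown step: copy new or changed files back to disk.

    Returns the sorted relative paths that were written. Files deleted
    from ram_dir are NOT removed from disk_dir in this simplified sketch.
    """
    written = []
    for root, _dirs, files in os.walk(ram_dir):
        rel = os.path.relpath(root, ram_dir)
        target_root = os.path.join(disk_dir, rel)
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            if not os.path.exists(dst) or not _same_contents(src, dst):
                shutil.copy2(src, dst)
                written.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(written)
```

Everything written since the last `sync_back` call is lost on a crash, but whatever is on disk is at least a set of complete files from an earlier sync.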

you have a failback mode that when the UPS tells you it has failed you 
switch to write-through mode, that's some use (but only if you get 
everything flushed first)
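
A toy model of that fallback, with the caveat spelled out in code. The class and method names here (`WriteBackCache`, `on_ups_failure`) are hypothetical and say nothing about how ramback implements it; the point is only that the switch to write-through protects new writes, while the backlog is safe only once the flush has completed.

```python
class WriteBackCache:
    """Hypothetical write-back cache with a write-through fallback mode."""

    def __init__(self):
        self.ram = {}             # fast copy, holds all data
        self.disk = {}            # stable media
        self.dirty = set()        # blocks in ram not yet on disk
        self.write_through = False

    def write(self, block, data):
        self.ram[block] = data
        if self.write_through:
            # Fallback mode: the write reaches disk before we return.
            self.disk[block] = data
        else:
            # Normal mode: acknowledge now, flush later.
            self.dirty.add(block)

    def flush(self):
        for block in sorted(self.dirty):
            self.disk[block] = self.ram[block]
        self.dirty.clear()

    def on_ups_failure(self):
        # The backlog has to reach disk first. If power is lost during
        # this flush, acknowledged writes are still missing from disk.
        self.flush()
        self.write_through = True
```

In a real system the flush takes time proportional to the amount of dirty data, which is exactly the window being argued about in this thread.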

another off-the-shelf option is that you could use DRBD between the 
ramdisk and the real drive, and when you lose power reconfigure to do 
synchronous updates instead of write-behind updates. that would still be 
far safer than ramback in its current mode.

> Don't expect this mode in the immediate future though, there are bugs
> to fix in the current driver, which already implements the required
> performance and stability requirements for a broad range of users.

and when those users ask why this functionality isn't in the kernel they 
will read this thread and learn how many risks they are taking (in spite 
of you promising them that they are perfectly safe)

anyone who has run any significant number of systems will not believe your 
statement that hardware and software are reliable enough to be trusted like 
this. by continuing to make this claim you are going to be ignored by 
those people, and frankly, they will distrust any of your work as a result.

>>> That is why I keep recommending that a ramback setup be replicated or
>>> mirrored, which people in this thread keep glossing over.  When
>>> replicated or mirrored, you still get the microsecond-level transaction
>>> times, and you get the safety too.

but a straight ramdisk can be replicated or mirrored. there's no need to 
have ramback to do this.

>> I agree, but in this case, you should present it this way. You have been
>> insisting too much on the average PC's reliability, the fact that no kernel
>> ever crashed for you, etc... So you are demonstrating that your product is
>> good provided that everything goes perfectly. All people who have experienced
>> software or hardware problems in the past (ie mostly everyone here) will not
>> trust your code because it relies on pre-requisites they know they do not
>> have.
>
> That would have been a miscommunication then.  I see arguments coming
> in that suggest embedded solutions, EMC for example, are inherently more
> reliable than a Linux based solution.  Well guess what?  Some of those
> embedded solutions already use Linux.

they aren't arguing that the embedded solutions are safer because they 
don't use linux. they are arguing that they are safer because they 
have different engineering than normal machines, and it's that engineering 
that makes them safer, not the software.

the reason why battery backed ram on a raid card is safer than a UPS on a 
general purpose machine is because the battery backed ram is static ram, 
while the ram in your system is dynamic ram. static ram only needs power 
to retain its memory, dynamic ram needs a processor running to access 
the ram continuously to refresh it.

you see 'battery+ram' in both cases and argue that they are equally safe. 
that just isn't the case.

the raid card can be pulled from one machine and put into another, in some 
cases the ram can be pulled from one card and plugged into another. it can 
sit on a shelf unplugged from anything but the battery for several days. 
this means that unless something physically damages the ram and enough 
drives to fail the raid array, the data is safe.

EMC, Netapp, and the other enterprise vendors have special purpose 
hardware to implement this safety. how much special hardware they have 
varies by company and equipment, but they all have some.

David Lang