Message-ID: <20031008124717.A1816@ring.CS.Berkeley.EDU>
From: nweaver at CS.berkeley.edu (Nicholas Weaver)
Subject: Re: [PAPER] Juggling with packets: floating data storage
On Wed, Oct 08, 2003 at 08:18:12PM +0200, Michal Zalewski composed:
> > The higher level juggling is still pointless. Let's assume I still have
> > the 1 Gb link to the outside world, and the data round trip latency is a
> > minute (the amount of time data can be stored externally before it comes
> > back to me). That's still just 60 Gb of data, or 7.5 GB. And I'm having
> > to burn a 1 Gb link to do it!
>
> Should you actually read the paper, it would be easier to comprehend the
> idea. The paper proposes several approaches that do not require you to
> send the data back and forth all the time; in some cases, you can achieve
> near-infinite latency with a very low "sustaining traffic" necessary to
> prevent the data from being discarded. Besides, the storage size is not
> the main point, the entire approach is much more interesting from the
> limited deniability high-privacy storage point of view.
You have some refresh time required, just like DRAM or any other
lossy/decaying medium. Without the refresh, you will have
uncorrectable data decay. The capacity limit is simply the refresh
bandwidth times the refresh period.
So let's assume a 1 DAY refresh time and a 1 Gb/s refresh bandwidth.
That's still a maximum of ~10 TB of storage (1 Gb/s * 86,400 s / 8
bits per byte = ~10,800 GB), and you are going to have to be
saturating the link to maintain it.
With a 100 Mb/s refresh bandwidth, that's about 1 TB. I can go out
and buy a box which holds nearly 1 TB for <$2000. Heck, I can buy a
>2 TB RAID array, FROM APPLE (hardly a low-price vendor), for $11k!
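
To make the back-of-the-envelope math explicit, here is a quick
sketch (plain Python; the function name and the numbers are just my
restatement of the figures above):

    # Upper bound on parasitic storage for a given refresh link, assuming
    # every byte has to be retransmitted once per refresh period.
    def max_storage_gb(link_gbps, refresh_period_s):
        return link_gbps * refresh_period_s / 8   # Gb -> GB

    day = 24 * 3600
    print(max_storage_gb(1.0, day))   # ~10,800 GB (~10 TB) on a 1 Gb/s link
    print(max_storage_gb(0.1, day))   # ~1,080 GB (~1 TB) on a 100 Mb/s link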
In terms of serving data, CDNs reportedly charge around $20/GB
served, or you can always construct a BitTorrent-like network that
uses the edge hosts, which is what Valve wants to do for
selling/patching Half-Life.
Even given a low/moderate-sized (~1k user) network, the data cost of
maintaining the refresh is going to be pretty extreme, and as a
result, rather/very unstealthy. Let's assume 1 TB shared between 1k
users, and a 1 hour refresh time (probably still overly generous).
That's 1 GB per user refreshed every hour, or ~2.2 Mb/s of continuous
upload bandwidth per user. I don't know about you, but I can't get
that reliably on my cable modem upload, and if I DID use that much
bandwidth, ALL THE TIME, the cable company would probably come
knocking.
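
Working that out (again just a sketch in Python, with the assumed
numbers above: 1 TB split evenly across 1k users, refreshed hourly):

    # Continuous upload needed per user to keep their share of the
    # parked data refreshed; all figures are the assumptions above.
    total_gb = 1000.0   # 1 TB shared
    users    = 1000
    period_s = 3600.0   # 1 hour refresh cycle

    per_user_gb   = total_gb / users                    # 1 GB per user
    per_user_mbps = per_user_gb * 8 * 1000 / period_s   # GB -> Mb, per second
    print(per_user_mbps)                                # ~2.2 Mb/s, continuously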
Why not just build a distributed filesystem for those 1K users?
The only real advantage is a modicum of stealth, but it isn't really
stealthy if you are storing a non-trivial amount of data: the network
traffic is "storage volume / refresh time", and the refresh time is
going to have to be fairly short to have a modicum of reliability.
Burning even just 10 Kb/s/user isn't going to be "stealthy" in
practice, due to the continual load.
For storing a small quantity of data (e.g., ~1-2 GB or less), there
are much better covert repositories one could consider.
> The second observation is that not all the data has to be sent back and
> forth all the time. Interesting.
The refresh is still required for parasitic storage, to prevent data
decay and to recompute checksums, etc., to handle data loss.
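
The refresh loop itself is conceptually simple; a minimal sketch (the
fetch/resend/rebuild hooks are hypothetical placeholders of mine, not
anything specified in the paper):

    import hashlib

    # checksums: {block_id: sha256 hex digest} for every parked block.
    # fetch/resend move blocks to and from the parasitic medium; rebuild
    # recreates a block from redundant copies when a checksum mismatches.
    def refresh_cycle(checksums, fetch, resend, rebuild):
        for block_id, digest in checksums.items():
            data = fetch(block_id)
            if data is None or hashlib.sha256(data).hexdigest() != digest:
                data = rebuild(block_id)   # repair decayed or lost block
            resend(block_id, data)         # push it back out before it expires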
Even given a 1 WEEK refresh cycle, a 100 Mb/s continual refresh
bandwidth only gets you ~7.5 TB of storage, and 1 Gb/s of aggregate
bandwidth gets ~75 TB.
--
Nicholas C. Weaver nweaver@...berkeley.edu