Message-ID: <20100212150137.648dca7c@hyperion.delvare>
Date: Fri, 12 Feb 2010 15:01:37 +0100
From: Jean Delvare <khali@...ux-fr.org>
To: "J.H." <warthog9@...nel.org>
Cc: linux-kernel <linux-kernel@...r.kernel.org>, mirrors@...nel.org,
users@...nel.org, "FTPAdmin Kernel.org" <ftpadmin@...nel.org>,
lasse.collin@...aani.org
Subject: Re: [kernel.org users] XZ Migration discussion
On Thu, 11 Feb 2010 10:36:03 -0800, J.H. wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hey Everyone,
>
> So as the subject states this is more a centralized discussion on
> migration plans to using and providing xz for content on kernel.org.
> Currently we provide gz and bz2, with gz acting as the original content
> and kernel.org itself generating the resulting bz2 files. There are a
> couple of possible proposals and wanted to toss them out there, and get
> feedback from everyone: the kernel community, the mirrors of kernel.org
> and the direct users of kernel.org.
Don't you have download statistics available? If we knew which
compression format is preferred, and by which margin, it would help make
an educated decision.
> ========================================================================
>
> Option 1)
>
> Leave gz as the master, and migrate bz2 to xz. This will happen in
> stages obviously. with bz2 ultimately being phased out.
>
> Migration option 1)
>
> All new content would be provided in .bz2 and .xz with
> an ultimate date set that the .bz2 files would stop
> being generated with new content. This would leave all
> existing content alone and it would not be a migration
> of the current .bz2 files to xz
>
> Migration option 2)
>
> At some point there would be a mass conversion of all
> existing content to include .bz2 and .xz. These would
> be run in parallel for a time period until it was
> determined that .bz2 was no longer needed and it would
> be removed from the servers leaving .gz and .xz
>
> Option 2)
>
> Convert the master data from gz to bz2 and use xz as the new file
> format. This has the downside of causing more tool churn as it means
> the kernel developers will have to eventually convert from gz to bz2,
> which means for a time there will be nag e-mails if you upload gz
> instead of bz2 and such. It would also mean that we (kernel.org) would
> need to be able to support .gz and .bz2 as master data for a time.
>
> Migration options are identical to Option 1 more or less, with either
> just new content getting converted, or all content getting converted.
>
> ========================================================================
>
> I'm personally leaning towards option 1, though personally don't really
> have a preference on the migration options, as both obviously offer
> different advantages, and again this e-mail is more to spur on the
> discussion and come to some general consensus across all of the groups
> concerned before moving forward with a more specific plan.
>
> So I'm inviting discussion, questions and comments on this so we know
> which way to ultimately go.
Maybe that's just me, but my main concern is neither download times nor
decompression times. My main concern is the access time to directory
indexes when browsing the kernel archive, because there are 5 entries
for every patch or tarball: .bz2, .bz2.sign, .gz, .gz.sign and .sign.
This is horribly slow. The main directory for 2.6 kernels has an index
weighing over 300 kB raw, turning into a ~600 kB document when
HTML-ized. Just fetching it takes 3 seconds and then my browser takes a
long time to format it. There are 3881 entries in that directory today,
and it keeps growing!
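
To make the five-entries-per-file pattern concrete, here is a rough Python
sketch that expands one base name into its index entries. The base name is
only an example, and the final division is an approximation, since the
directory also holds files (the LATEST-IS-* markers, for instance) which do
not follow the pattern.

```python
# Suffixes listed above: two compressed copies, their signatures, and the
# signature of the uncompressed file.
SUFFIXES = (".bz2", ".bz2.sign", ".gz", ".gz.sign", ".sign")


def index_entries(base):
    """Return the index entries published for one patch or tarball."""
    return [base + suffix for suffix in SUFFIXES]


def approx_items(total_entries):
    """Estimate how many patches/tarballs lie behind an entry count.

    Assumes every entry follows the five-suffix pattern, which is only
    approximately true for the real directory.
    """
    return total_entries // len(SUFFIXES)


if __name__ == "__main__":
    for entry in index_entries("linux-2.6.32.tar"):
        print(entry)
    print(approx_items(3881), "patches or tarballs, roughly")
```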
So, once we have settled on a compression strategy, I think it would
be the right time to discuss the directory structure. With the advent
of the stable branches and the new development model - which pretty
much implies that we'll live with main version 2.6 forever - the file
count is much higher than it used to be.
I can think of several ways to improve the situation here, some of
which could be combined.
1* Keep a single compression format. This saves almost 40% of the
files.
2* Move one of the compression formats somewhere else, so that it
doesn't get in the way but is still available if needed.
3* Create a new subdirectory for every 2.6.x kernel, and move all the
related files there. This would shrink the main index drastically, and
each subdirectory would have a reasonable size (except maybe 2.6.16 and
2.6.27.) Oddly enough this has been done for the files under testing/
already, so I am curious why we don't do it for the release files (and
the testing/incr/ files, while we're at it.)
4* Get rid of the LATEST-IS-* files. This is a small count, won't save
much, but these files seem totally useless to me these days. Depending
on what you want exactly, there are many versions which can be
considered the latest, and there are better ways to know which they are
(for example http://www.eu.kernel.org/kdist/finger_banner ). And these
files tend to get stuck so you can't rely on them anyway.
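
Ideas 1 and 3 above can be sketched in a few lines of Python. This is only
an illustration under assumptions: the file name prefixes recognized by the
regular expression, the use of the bare 2.6.x string as the subdirectory
name, and the helper names are all mine, not anything kernel.org uses.

```python
import re
from collections import defaultdict

# Matches names such as linux-2.6.27.tar.gz or patch-2.6.27.45.bz2 and
# captures only the 2.6.x part. The list of prefixes is an assumption.
RELEASE_RE = re.compile(r"^(?:linux|patch|ChangeLog)-(2\.6\.\d+)")


def release_of(filename):
    """Return the 2.6.x release a file belongs to, or None if unknown."""
    match = RELEASE_RE.match(filename)
    return match.group(1) if match else None


def group_by_release(filenames):
    """Idea 3: map each 2.6.x release to the files that would move there."""
    groups = defaultdict(list)
    for name in filenames:
        groups[release_of(name)].append(name)
    return dict(groups)


def drop_format(filenames, ext):
    """Idea 1: remove one compression format and its signature files."""
    return [n for n in filenames if not n.endswith((ext, ext + ".sign"))]


if __name__ == "__main__":
    suffixes = (".bz2", ".bz2.sign", ".gz", ".gz.sign", ".sign")
    names = [base + s
             for base in ("linux-2.6.27.tar", "patch-2.6.27.45")
             for s in suffixes]
    names.append("LATEST-IS-2.6.32.8")

    for release, files in group_by_release(names).items():
        print(release, len(files))

    kept = drop_format(names, ".gz")
    print("entries before:", len(names), "after:", len(kept))
```

Files which match no known prefix end up under the None key, so they can
be reviewed by hand rather than silently moved. Dropping one format removes
2 of the 5 entries for each patch or tarball, which is where the "almost
40%" figure comes from.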
I wouldn't worry too much about breaking the current locations. Just
give some time for software authors (ketchup comes to mind) to update
their code and it shouldn't be a big problem.
Thanks,
--
Jean Delvare