Date:   Tue, 18 Dec 2018 21:26:27 +0100
From:   Jasper Spaans <j@...per.es>
To:     Joey Pabalinas <joeypabalinas@...il.com>,
        Joe Perches <joe@...ches.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] LKML Archive in Maildir Format

Hi Joey,

On Sun, Dec 16, 2018 at 09:21:35AM -1000, Joey Pabalinas wrote:
> > > I spent a lot of time trying to find an LKML archive in Maildir format
> > > that I could use for local searches with notmuch or something, but all
> > > the links I was able to find were all dead.
> > 
> > You might instead use
> > 
> > https://www.kernel.org/lore.html
> > https://git.kernel.org/pub/scm/public-inbox/vger.kernel.org/git.git/
> 
> That was my first attempt, but the documentation for the public-inbox
> format is sort of terrible, and after a few hours trying to convert it
> to Maildir I just gave up.
> 
> I ended up just slowly scraping lkml.org for a couple weeks so I
> wouldn't disrupt anything and it worked fairly well. Just looking for
> advice on where to host this now so others might be able to use it.

Now you've caught my attention; first of all, there are more than 3M
messages stored in the lkml.org database, so I guess you've missed some
messages or something is really broken.

Besides, unless you figured out how to get to the raw data, you've just
scraped a rendering which discards stuff like PGP signatures etc. and has
very incomplete headers. Unless you don't care about those, of course :)

Note that I've also been toying with the lore dataset, and wrote a tiny tool
to get Maildir-like data out of it; this code is a bit of a single-use-jig
so you'll need to do some coding if you really want to use it.  Attached
anyway.
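
For readers without the attachment, here is a minimal sketch of the general
idea. It is not the attached test.py. It assumes the repository uses the
public-inbox v2 layout, where each commit stores one message in a top-level
file named "m"; older v1 repositories are laid out differently and would need
another reader. The function names and the "master" default are made up for
illustration.

```python
import mailbox
import subprocess
from typing import Iterable, Iterator


def iter_raw_messages(repo: str, rev: str = "master") -> Iterator[bytes]:
    """Yield raw message bytes from a git repository, oldest commit first.

    Assumption: each commit holds exactly one message in a file named "m"
    (public-inbox v2 layout). Commits without that file are skipped.
    """
    commits = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse", rev],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    for commit in commits:
        shown = subprocess.run(
            ["git", "-C", repo, "show", f"{commit}:m"],
            capture_output=True,
        )
        if shown.returncode == 0:
            yield shown.stdout


def write_maildir(raw_messages: Iterable[bytes], path: str) -> int:
    """Store each raw message in a Maildir at `path`; return how many were added."""
    box = mailbox.Maildir(path, create=True)
    count = 0
    try:
        for raw in raw_messages:
            box.add(raw)  # bytes are stored as-is, headers and signatures included
            count += 1
    finally:
        box.close()
    return count
```

Something like write_maildir(iter_raw_messages("path/to/repo.git"),
"lkml-maildir") would tie the two together. Spawning one git process per
commit is slow for millions of messages; a long-running "git cat-file --batch"
process would likely be the better choice for a full archive.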

All the best and enjoy,
Jasper

View attachment "Pipfile" of type "text/plain" (168 bytes)

View attachment "test.py" of type "text/x-python" (1083 bytes)

Download attachment "signature.asc" of type "application/pgp-signature" (1529 bytes)
