lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20081216032405.GB19686@leela.home.grueneberg.de>
Date:	Tue, 16 Dec 2008 04:24:06 +0100
From:	Andre Grueneberg <list.lguest@...eneberg.de>
To:	Rusty Russell <rusty@...tcorp.com.au>
Cc:	lguest@...abs.org, linux-kernel@...r.kernel.org
Subject: Re: [Lguest] lguest, virtio_blk, id is not a head!

Hi Rusty,

Rusty Russell wrote:
> > Lately I upgraded my host and guest from kernel 2.6.25 (Debian) to
> > vanilla 2.6.27.8. I have two virtual machines and in both I recognised
> > the same issue after a while.
> > 
> > After running for some hours the first guest showed the following
> > message on the console:
> > virtio_blk virtio1: id 1 is not a head!
>    This is interesting.  There hasn't been a great deal of change between
> those versions on the lguest side.  This sounds like a race, and the only
> thing I can see which would have introduced this is the change below.

I removed the "lguest: turn Waker into a thread, not a process" patch as
proposed. Short result: The problem still exists.

Longer story:
As this issue is extremly hard to reproduce, I made an attempt to cause
many file system reads. If I understood things correctly this is when
the guest side of virtio_blk is likely to call vring_get_buf?! I did so
by creating a large file (300MiB vs. 256MiB memory) in the guest and
reading it multiple times in 1k pieces (dd).

To verify that it works, I ran this loop on 2.6.27.8 guest with the
original launcher: failure after ~200 iterations.

So I went on with the patched version of lguest (launcher with waker
process) and the same 2.6.27.8 guest: failure after ~600 iterations.

Next step was a 2.6.25 guest with the original launcher: I stopped it
after ~1600 iterations did not show a fault.

As it occured to me that using a file in a guest's file system might not
be the optimal target, I have now modified my loop to use the first
300MiB of the block device itself. With the original combination, I now
got the failure after ~750 iterations.

I am now running this test with 2.6.25 guest again. As the host is also
serving some serious purpose, I can only run this kind of long lasting
tests during night time. So I'll be forced to stop this test soon (but
I'll move into bed first).

My next step will be to use some non-root block device, to have a chance
to access the guest after the crash. Do you have any idea what to look
at then?

One thing, I just recognised:
While my 2.6.25 guest kernel is configured with "NOHz mode", the 
2.6.27.8 guest is using "high resolution mode".

Thanks so far,
André
-- 
Reality is a crutch for people who can't handle drugs.

Download attachment "signature.asc" of type "application/pgp-signature" (198 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ