Message-ID: <1549898730.2831.6.camel@HansenPartnership.com>
Date:   Mon, 11 Feb 2019 07:25:30 -0800
From:   James Bottomley <James.Bottomley@...senPartnership.com>
To:     Jens Axboe <axboe@...nel.dk>,
        Mikael Pettersson <mikpelinux@...il.com>
Cc:     Linux SPARC Kernel Mailing List <sparclinux@...r.kernel.org>,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes
 5 minute delay during boot on Sun Blade 2500

On Sun, 2019-02-10 at 09:35 -0700, Jens Axboe wrote:
> On 2/10/19 9:25 AM, James Bottomley wrote:
> > On Sun, 2019-02-10 at 09:05 -0700, Jens Axboe wrote:
> > > On 2/10/19 8:44 AM, James Bottomley wrote:
> > > > On Sun, 2019-02-10 at 10:17 +0100, Mikael Pettersson wrote:
> > > > > On Sat, Feb 9, 2019 at 7:19 PM James Bottomley
> > > > > <James.Bottomley@...senpartnership.com> wrote:
> > > > 
> > > > [...]
> > > > > > I think the reason for this is that the block mq path
> > > > > > doesn't feed the kernel entropy pool correctly, hence the
> > > > > > need to install an entropy gatherer for systems that don't
> > > > > > have other good random number sources.
> > > > > 
> > > > > That does sound plausible, I admit I didn't even consider the
> > > > > possibility that the old block I/O path also was an entropy
> > > > > source.
> > > > 
> > > > In theory, the new one should be as well since the rotational
> > > > entropy collector is on the SCSI completion path.   I'd seen
> > > > the same problem but had assumed it was something someone had
> > > > done to our internal entropy pool and thus hadn't bisected it.
> > > 
> > > The difference is that the old stack included ADD_RANDOM by
> > > default, so this check:
> > > 
> > > 	if (blk_queue_add_random(q))
> > > 		add_disk_randomness(req->rq_disk);
> > > 
> > > in scsi_end_request() would be true, and we'd add the randomness.
> > > For sd, it seems to set it just fine for non-rotational drives.
> > > Could this be because other devices don't? Maybe the below makes
> > > a difference.
> > 
> > No, in both we set it per the rotational parameters of the disk in 
> > 
> > sd.c:sd_read_block_characteristics()
> > 
> > 	rot = get_unaligned_be16(&buffer[4]);
> > 
> > 	if (rot == 1) {
> > 		blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
> > 		blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
> > 	} else {
> > 		blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
> > 		blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
> > 	}
> > 
> > 
> > That check wasn't changed by the code removal.
> 
> As I said above, for sd. This isn't true for non-disks.

Yes, but the behaviour above doesn't change across a switch to MQ, so I
don't quite understand how it bisects back to that change.  If we're
not gathering entropy for the device now, we wouldn't have been before
the switch, so the entropy characteristics shouldn't have changed.

> > Although I suspect it should be unconditional: even SSDs have what
> > would appear as seek latencies at least during writes depending on
> > the time taken to find an erased block or even trigger garbage
> > collection.  The entropy collector is good at taking something
> > completely regular and spotting the inconsistencies, so it won't
> > matter that loads of "seeks" are deterministic.
> 
> The reason it isn't is that it's of limited use for SSDs where it's a
> lot more predictable. And they are also a lot faster, which means the
> adding randomness is more problematic from an efficiency pov.

But that's my point: our entropy extractor is good at weeding out
predictable signals.  Fine, it won't extract any entropy if the disk
seek time is entirely regular, but it won't contaminate the entropy
pool.  The computational delay, I grant ... it takes a while to
determine if any entropy is present in the signal.
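
To illustrate the point about predictable signals, here is a minimal
userspace sketch of a delta-based entropy estimator, loosely modelled on
the timing heuristic in drivers/char/random.c. It is not the kernel code:
the names timer_state, credit_bits() and credit_after_warmup() are
hypothetical, the timestamps are plain integers in arbitrary units, and
the cap of 11 bits per event is an assumption carried over from memory of
the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of per-source timing state (not the kernel struct). */
struct timer_state {
	long last_time;
	long last_delta;
	long last_delta2;
};

/* Number of significant bits in v; 0 when v == 0. */
static int bit_length(unsigned long v)
{
	int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/*
 * Estimate how many bits of entropy one completion timestamp carries.
 * The credit is based on the smallest of the first, second and third
 * order timing differences, so a perfectly periodic signal (second
 * order difference of zero) is credited nothing.
 */
static int credit_bits(struct timer_state *s, long now)
{
	long delta, delta2, delta3, min;
	int bits;

	assert(s != NULL);

	delta = now - s->last_time;
	s->last_time = now;
	delta2 = delta - s->last_delta;
	s->last_delta = delta;
	delta3 = delta2 - s->last_delta2;
	s->last_delta2 = delta2;

	delta = labs(delta);
	delta2 = labs(delta2);
	delta3 = labs(delta3);

	min = delta;
	if (delta2 < min)
		min = delta2;
	if (delta3 < min)
		min = delta3;

	/* Assumed cap: never credit more than 11 bits per event. */
	bits = bit_length((unsigned long)min >> 1);
	return bits > 11 ? 11 : bits;
}

/* Sum the credit for all events after the first three (warm-up). */
static int credit_after_warmup(const long *times, size_t n)
{
	struct timer_state s = { 0, 0, 0 };
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int bits = credit_bits(&s, times[i]);

		if (i >= 3)
			total += bits;
	}
	return total;
}
```

Fed completion times at a fixed interval, credit_after_warmup() returns
zero once the history has filled, while times with irregular gaps earn a
nonzero credit. The cost is the per-event arithmetic, which is the
computational delay conceded above.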

What about feeding it with something like discard timings, which should
be much less predictable?

James
