Message-ID: <yq14kxygv4h.fsf@oracle.com>
Date: Tue, 17 Dec 2019 22:39:26 -0500
From: "Martin K. Petersen" <martin.petersen@...cle.com>
To: Guenter Roeck <linux@...ck-us.net>
Cc: "Martin K. Petersen" <martin.petersen@...cle.com>,
Linus Walleij <linus.walleij@...aro.org>,
linux-hwmon@...r.kernel.org, Jean Delvare <jdelvare@...e.com>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
linux-scsi@...r.kernel.org, linux-ide@...r.kernel.org,
Chris Healy <cphealy@...il.com>
Subject: Re: [PATCH 1/1] hwmon: Driver for temperature sensors on SATA drives
Guenter,
> If there are 100 physical drives, you would actually want to see the
> temperature of each drive separately, as one of them might be
> overheating due to some internal failure.
Yep. However, for "big boxes" you'll typically get that information from
SAF-TE or SES enclosure services and not from the drive itself.
SES allows you to monitor power supplies, drive bays, hot swap events,
thermals, etc. We have a SES driver in SCSI that exposes all these
things in sysfs. It is not currently tied into hwmon.
> If the storage array is represented to the system as single huge
> physical drive, which is then split into logical entities not related
> to physical drives, I guess that would represent a problem for system
> management overall.
Yep. That's why there's dedicated plumbing in smartmontools to handle
various RAID controller interfaces for accessing physical drive
information. It's typically highly vendor-specific.
> I would not mind to tie the hardware monitoring device to something
> else than the scsi device if the scsi device does not always have a
> physical representation. Is there a way to determine if a scsi device
> is virtual or real ?
Not really. Target is usually a pretty good approximation, although some
arrays introduce virtual targets because of limited LUN (scsi_device)
numbering capabilities. However, arrays generally don't support per-LUN
temperature because it makes no sense.
I'm trying to gauge how much of a pain potentially redundant sensors
would be for userland monitoring tooling vs. how many oddball devices
we'd not be able to support if we were to use scsi_target as parent (or
restrict the sensor binding to LUN 0).
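
To make the trade-off concrete, here is a rough sketch (not kernel code)
that counts how many sensors each binding policy would produce for a
given set of SCSI addresses. It assumes addresses are written as
host:channel:target:lun strings; the function names and the sample
addresses are made up for illustration only.

```python
from collections import defaultdict


def parse_hctl(addr):
    """Split a 'host:channel:target:lun' string into four integers."""
    host, channel, target, lun = (int(part) for part in addr.split(":"))
    return host, channel, target, lun


def group_by_target(addrs):
    """Map each (host, channel, target) to the sorted list of its LUNs."""
    targets = defaultdict(list)
    for addr in addrs:
        host, channel, target, lun = parse_hctl(addr)
        targets[(host, channel, target)].append(lun)
    return {key: sorted(luns) for key, luns in targets.items()}


def sensors_per_policy(addrs):
    """Count the sensors created under three hypothetical binding policies."""
    by_target = group_by_target(addrs)
    return {
        # One sensor per scsi_device: redundant if LUNs share a target.
        "per_device": len(addrs),
        # One sensor per scsi_target.
        "per_target": len(by_target),
        # Bind only to LUN 0: targets without a LUN 0 get no sensor.
        "lun0_only": sum(1 for luns in by_target.values() if 0 in luns),
    }


if __name__ == "__main__":
    # Hypothetical addresses; 1:0:0:5 is a target that exposes no LUN 0.
    sample = ["0:0:0:0", "0:0:0:1", "0:0:1:0", "1:0:0:5"]
    print(sensors_per_policy(sample))
```

In this made-up sample, per-device binding yields one redundant sensor
(two LUNs behind target 0:0:0), while the LUN 0 restriction leaves the
target at 1:0:0 without any sensor at all.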
--
Martin K. Petersen Oracle Linux Engineering