[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <51F0AFAE.9080802@jp.fujitsu.com>
Date: Thu, 25 Jul 2013 13:55:10 +0900
From: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
To: Hush Bensen <hush.bensen@...il.com>
CC: Toshi Kani <toshi.kani@...com>, Ingo Molnar <mingo@...nel.org>,
<akpm@...ux-foundation.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, <x86@...nel.org>, <dave@...1.net>,
<kosaki.motohiro@...il.com>, <tangchen@...fujitsu.com>,
<vasilis.liaskovitis@...fitbricks.com>
Subject: Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
(2013/07/25 12:34), Hush Bensen wrote:
> On 07/25/2013 11:08 AM, Yasuaki Ishimatsu wrote:
>> (2013/07/25 9:56), Hush Bensen wrote:
>>> On 07/25/2013 12:02 AM, Toshi Kani wrote:
>>>> On Wed, 2013-07-24 at 08:18 +0800, Hush Bensen wrote:
>>>>> On 07/24/2013 04:45 AM, Toshi Kani wrote:
>>>>>> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
>>>>>>> * Toshi Kani <toshi.kani@...com> wrote:
>>>>>>>
>>>>>>>>> Could we please also fix it to never crash the kernel, even if stupid
>>>>>>>>> ranges are provided?
>>>>>>>> Yes, this probe interface can be enhanced to verify the firmware
>>>>>>>> information before adding a given memory address. However, such change
>>>>>>>> would interfere its test use of "fake" hotplug, which is only the known
>>>>>>>> use-case of this interface on x86.
>>>>>>> Not crashing the kernel is not a novel concept even for test interfaces...
>>>>>> Agreed.
>>>>>>
>>>>>>> Where does the possible crash come from - from using invalid RAM ranges,
>>>>>>> right? I.e. on x86 to fix the crash we need to check the RAM is present in
>>>>>>> the e820 maps, is marked RAM there, and is not already registered with the
>>>>>>> kernel, or so?
>>>>>> Yes, the crash comes from using invalid RAM ranges. How to check if the
>>>>>> RAM is present is different if the system supports hotplug or not.
>>>>> Could you explain different methods to check the RAM is present if the
>>>>> system supports hotplkug or not?
>>>> e820 and UEFI memory descriptor tables are the boot-time interfaces.
>>>> These interfaces are not required to reflect any run-time changes.
>>>>
>>>> ACPI memory device objects can be used at both boot-time and run-time,
>>>> which reflect any run-time changes. But they are optional to implement.
>>>> They typically are not implemented unless the system supports hotplug.
>>>>
>>>>>>>> In order to verify if a given memory address is enabled at run-time (as
>>>>>>>> opposed to boot-time), we need to check with ACPI memory device objects
>>>>>>>> on x86. However, system vendors tend to not implement memory device
>>>>>>>> objects unless their systems support memory hotplug. Dave Hansen is
>>>>>>>> using this interface for his testing as a way to fake a hotplug event on
>>>>>>>> a system that does not support memory hotplug.
>>>>>>> All vendors implement e820 maps for the memory present at boot time.
>>>>>> Yes for boot time. At run-time, e820 is not guaranteed to represent a
>>>>>> new memory added. Here is a quote from ACPI spec.
>>>>>>
>>>>>> ===
>>>>>> 15.1 INT 15H, E820H - Query System Address Map
>>>>>> :
>>>>>> The memory map conveyed by this interface is not required to reflect any
>>>>>> changes in available physical memory that have occurred after the BIOS
>>>>>> has initially passed control to the operating system. For example, if
>>>>>> memory is added dynamically, this interface is not required to reflect
>>>>>> the new system memory configuration.
>>>>>> ===
>>>>>>
>>>>>> By definition, the "probe" interface is used for the kernel to recognize
>>>>>> a new memory added at run-time. So, it should check ACPI memory device
>>>>>> objects (which represents run-time state) for the verification. On x86,
>>>>>> however, ACPI also sends a hotplug event to the kernel, which triggers
>>>>>> the kernel to recognize the new physical memory properly. Hence, users
>>>>>> do not need this "probe" interface.
>>>>>>
>>>>>>> How is the testing done by Dave Hansen? If it's done by booting with less
>>>>>>> RAM than available (via say the mem=1g boot parameter), and then
>>>>>>> hot-adding some of the missing RAM, then this could be made safe via the
>>>>>>> e820 maps and by consultig the physical memory maps (to avoid double
>>>>>>> registry), right?
>>>>>> If we focus on this test scenario on a system that does not support
>>>>>> hotplug, yes, I agree that we can check with e820 since it is safe to
>>>>>> assume that the system has no change after boot. IOW, it is unsafe to
>>>>>> check with e820 if the system supports hotplug, but there is no use in
>>>>>> this interface for testing if the system supports hotplug. So, this may
>>>>>> be a good idea.
>>>>>>
>>>>>> Dave, is this how you are testing? Do you always specify a valid memory
>>>>>> address for your testing?
>>>>>>
>>>>>>> How does the hotplug event based approach solve double adds? Relies on the
>>>>>>> hardware not sending a hot-add event twice for the same memory area or for
>>>>>>> an invalid memory area, or does it include fail-safes and double checks as
>>>>>>> well to avoid double adds and adding invalid memory? If yes then that
>>>>>>> could be utilized here as well.
>>>>>> In high-level, here is how ACPI memory hotplug works:
>>>>>>
>>>>>> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
>>>>>> hot-added.
>>>>>> 2. The kernel is notified, and verifies if the new memory device object
>>>>>> has not been attached by any handler yet.
>>>>>> 3. The memory handler is called, and obtains a new memory range from the
>>>>>> ACPI memory device object.
>>>>>> 4. The memory handler calls add_memory() with the new address range.
>>>>>>
>>>>>> The above step 1-4 proceeds automatically within the kernel. No user
>>>>>> input (nor sysfs interface) is necessary. Step 2 prevents double adds
>>>>>> and step 3 gets a valid address range from the firmware directly. Step
>>>>>> 4 is basically the same as the "probe" interface, but with all the
>>>>>> verification up front, this step is safe.
>>>>> This is hot-added part, could you also explain how ACPI memory hotplug
>>>>> works for hot-remove?
>>>> Sure. Here is high-level.
>>>>
>>>> 1. ACPI sends a hotplug event to an ACPI memory device object that is
>>>> requested to hot-remove.
>>>> 2. The kernel is notified, and verifies if the memory device object is
>>>> attached by a handler.
>>>> 3. The memory handler is called (which is being attached), and obtains
>>>> its memory range.
>>>> 4. The memory handler calls remove_memory() with the address range.
>>>> 5. The kernel calls eject method of the ACPI memory device object.
>>>
>>> If hot remove the memory device by the hardware, or writing 1 to
>>> /sys/bus/acpi/devices/PNP0C80:XX/eject both will call eject method?
>>
>> Yes.
>> Both operations will call eject method.
>>
>>> What's the difference between these two methods? I guess the former will send SCI and the latter
>>> won't.
>>
>> Triggers are different. Former is triggered by SCI, latter is triggered by
>> writing sysfs.
>
> Thanks, another question, what's the role of udev in memory hotplug?
Udev notifies user land of hotplug events. So if you want to do something
after memory hotplug automatically, you can do it by registering your script
to udev.
Thanks,
Yasuaki Ishimatsu
>
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>
>>>
>>>>
>>>> Thanks,
>>>> -Toshi
>>>>
>>>>
>>>
>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists