[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9627eb50-9e90-4f03-9197-78b3b8a434fa@os.amperecomputing.com>
Date: Mon, 10 Mar 2025 10:14:08 -0700
From: Daniel Ferguson <danielf@...amperecomputing.com>
To: Shiju Jose <shiju.jose@...wei.com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"bp@...en8.de" <bp@...en8.de>, "tony.luck@...el.com" <tony.luck@...el.com>,
"rafael@...nel.org" <rafael@...nel.org>, "lenb@...nel.org"
<lenb@...nel.org>, "mchehab@...nel.org" <mchehab@...nel.org>,
"leo.duran@....com" <leo.duran@....com>,
"Yazen.Ghannam@....com" <Yazen.Ghannam@....com>
Cc: "linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"dave@...olabs.net" <dave@...olabs.net>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
"dave.jiang@...el.com" <dave.jiang@...el.com>,
"alison.schofield@...el.com" <alison.schofield@...el.com>,
"vishal.l.verma@...el.com" <vishal.l.verma@...el.com>,
"ira.weiny@...el.com" <ira.weiny@...el.com>,
"david@...hat.com" <david@...hat.com>,
"Vilas.Sridharan@....com" <Vilas.Sridharan@....com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"rientjes@...gle.com" <rientjes@...gle.com>,
"jiaqiyan@...gle.com" <jiaqiyan@...gle.com>,
"Jon.Grimm@....com" <Jon.Grimm@....com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"naoya.horiguchi@....com" <naoya.horiguchi@....com>,
"james.morse@....com" <james.morse@....com>,
"jthoughton@...gle.com" <jthoughton@...gle.com>,
"somasundaram.a@....com" <somasundaram.a@....com>,
"erdemaktas@...gle.com" <erdemaktas@...gle.com>,
"pgonda@...gle.com" <pgonda@...gle.com>,
"duenwen@...gle.com" <duenwen@...gle.com>,
"gthelen@...gle.com" <gthelen@...gle.com>,
"wschwartz@...erecomputing.com" <wschwartz@...erecomputing.com>,
"dferguson@...erecomputing.com" <dferguson@...erecomputing.com>,
"wbs@...amperecomputing.com" <wbs@...amperecomputing.com>,
"nifan.cxl@...il.com" <nifan.cxl@...il.com>,
tanxiaofei <tanxiaofei@...wei.com>, "Zengtao (B)"
<prime.zeng@...ilicon.com>, Roberto Sassu <roberto.sassu@...wei.com>,
"kangkang.shen@...urewei.com" <kangkang.shen@...urewei.com>,
wanghuiqiang <wanghuiqiang@...wei.com>, Linuxarm <linuxarm@...wei.com>
Subject: Re: [PATCH v2 3/3] ras: mem: Add memory ACPI RAS2 driver
>>>> +static int ras2_hw_scrub_read_size(struct device *dev, void
>>>> +*drv_data, u64 *size) {
>>>> + struct ras2_mem_ctx *ras2_ctx = drv_data;
>>>> + int ret;
>>>> +
>>>> + if (ras2_ctx->bg_scrub)
>>>> + return -EBUSY;
>>>> +
>>>> + ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + *size = ras2_ctx->size;
>>>> +
>>>> + return 0;
>>>> +}
>>>
>>> Calling ras2_update_patrol_scrub_params_cache here is problematic.
>>>
>>> Imagine:
>>> echo 0x1000 > size
>>> cat size
>>> echo 0x2000000000 > addr
>>>
>>> What happens here? What happens is the scrub range is not what you
>>> expect it to be. Once you cat size, you reset the size from what you initially set
>> it to.
>>> I don't think that is what anyone will expect. It certainly caused us
>>> to stumble while testing.
>>
>> This is an expected behavior and this extra call was added here when changed
>> using attribute 'addr' to start the on-demand scrub operation instead of
>> previous separate attribute ' enable_on_demand' to start the on-demand scrub
>> operation, according to Borislav's suggestion in v13.
>>
>> Please see the following comment in the ras2_hw_scrub_read_addr() fnction,
>> "Userspace will get the status of the demand scrubbing through the address
>> range read from the firmware. When the demand scrubbing is finished
>> firmware must reset actual address range to 0. Otherwise userspace assumes
>> demand scrubbing is in progress."
Why not just use Bit[0] in the Flags register of the Parameter Block Structure
for PATROL_SCRUB? It seems having firmware reset the actual address range is
extra complexity for something we already have a facility for.
>>
>> Here sysfs attributes 'addr' and 'size' is reading the field: Actual Address Range
>> of Table 5.87: Parameter Block Structure for PATROL_SCRUB, written by the
>> firmware.
>>
>> In my opinion, reading back the address range size set in the sysfs before
>> actually writing the address range to the firmware and starting the on-demand
>> scrub operation doesn't hold much significance?
>
> After further discussion, I will add a fix for this case to return the 'size' which the user set in the sysfs
> until the scrubbing is started.
I think fixing this will make the interface less confusing, but I also agree
that it doesn't hold much significance technically.
Regards,
Daniel
>
> Thanks,
> Shiju
>>
>
Powered by blists - more mailing lists