Message-ID: <1f9d5190-97f2-f98e-c7c4-80e259346e91@huawei.com>
Date: Thu, 15 Jun 2017 09:00:54 +0100
From: John Garry <john.garry@...wei.com>
To: wangyijing <wangyijing@...wei.com>,
Johannes Thumshirn <jthumshirn@...e.de>,
<jejb@...ux.vnet.ibm.com>, <martin.petersen@...cle.com>
CC: <chenqilin2@...wei.com>, <hare@...e.com>,
<linux-scsi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<chenxiang66@...ilicon.com>, <huangdaode@...ilicon.com>,
<wangkefeng.wang@...wei.com>, <zhaohongjiang@...wei.com>,
<dingtianhong@...wei.com>, <guohanjun@...wei.com>,
<yanaijie@...wei.com>, <hch@....de>, <dan.j.williams@...el.com>,
<emilne@...hat.com>, <thenzl@...hat.com>, <wefu@...hat.com>,
<charles.chenxin@...wei.com>, <chenweilong@...wei.com>,
Yousong He <heyousong@...wei.com>
Subject: Re: [PATCH v2 1/2] libsas: Don't process sas events in static works
On 15/06/2017 08:37, wangyijing wrote:
>
>
> On 2017/6/14 21:08, John Garry wrote:
>> On 14/06/2017 10:04, wangyijing wrote:
>>>>> static void notify_ha_event(struct sas_ha_struct *sas_ha, enum ha_event event)
>>>>>>> {
>>>>>>> + struct sas_ha_event *ev;
>>>>>>> +
>>>>>>> BUG_ON(event >= HA_NUM_EVENTS);
>>>>>>>
>>>>>>> - sas_queue_event(event, &sas_ha->pending,
>>>>>>> - &sas_ha->ha_events[event].work, sas_ha);
>>>>>>> + ev = kzalloc(sizeof(*ev), GFP_ATOMIC);
>>>>>>> + if (!ev)
>>>>>>> + return;
>>>>> GFP_ATOMIC allocations can fail and then no events will be queued *and* we
>>>>> don't report the error back to the caller.
>>>>>
>>> Yes, it really is a problem, but I haven't found a better solution. Do you have any suggestions?
>>>
>>
>> Dan raised an issue with this approach, regarding a malfunctioning PHY which spews out events. I still don't think we're handling it safely. Here's the suggestion:
>> - each asd_sas_phy owns a finite-sized pool of events
>> - when the event pool becomes exhausted, libsas stops queuing events (obviously) and disables the PHY in the LLDD
>> - upon attempting to re-enable the PHY from sysfs, libsas first checks that the pool is still not exhausted
>>
>> If you cannot find a good solution, then let us know and we can help.
>
> Hi John and Dan, which events did you see on the malfunctioning PHY? If the event is PORTE_BROADCAST_RCVD, then since
> libsas always calls sas_revalidate_domain() for every PORTE_BROADCAST_RCVD, what about keeping one broadcast pending (not queued in the workqueue)
> and discarding the others? If the events are of other types, things may become knotty.
>
As I mentioned in the v1 series discussion, I found that a poorly
connected expander PHY was continuously spewing out PHY-up and
loss-of-signal events. This is the sort of situation we should protect
against. The current solution is OK in this respect, as it uses a static
event per port/PHY/HA.
The point is that we cannot allow a PHY to keep sending events to
libsas without bound, as this may lead to memory exhaustion.
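
To illustrate the bounded-pool suggestion quoted above, here is a minimal
user-space sketch. It is not libsas code: struct phy_model, struct
phy_event, PHY_EVENT_POOL_SIZE and all the phy_*() helpers are
hypothetical names invented for this example, and a real implementation
would need locking and an actual call into the LLDD to disable the PHY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PHY_EVENT_POOL_SIZE 4	/* hypothetical limit, for illustration only */

/* Hypothetical stand-ins; these are NOT the real libsas structures. */
struct phy_event {
	int event;	/* event code reported by the LLDD */
	bool in_use;	/* true while the event is queued or being processed */
};

struct phy_model {
	struct phy_event pool[PHY_EVENT_POOL_SIZE];	/* finite per-PHY pool */
	bool enabled;					/* false once shut off */
};

static void phy_init(struct phy_model *phy)
{
	for (size_t i = 0; i < PHY_EVENT_POOL_SIZE; i++)
		phy->pool[i].in_use = false;
	phy->enabled = true;
}

/* Take a free event from the pool, or return NULL if it is exhausted. */
static struct phy_event *phy_event_get(struct phy_model *phy)
{
	for (size_t i = 0; i < PHY_EVENT_POOL_SIZE; i++) {
		if (!phy->pool[i].in_use) {
			phy->pool[i].in_use = true;
			return &phy->pool[i];
		}
	}
	return NULL;
}

/* Return an event to the pool once it has been processed. */
static void phy_event_put(struct phy_model *phy, struct phy_event *ev)
{
	(void)phy;
	assert(ev != NULL && ev->in_use);
	ev->in_use = false;
}

static bool phy_pool_exhausted(const struct phy_model *phy)
{
	for (size_t i = 0; i < PHY_EVENT_POOL_SIZE; i++)
		if (!phy->pool[i].in_use)
			return false;
	return true;
}

/*
 * Queue an event. On pool exhaustion, stop queuing and mark the PHY
 * disabled (standing in for disabling it in the LLDD).
 */
static bool phy_notify_event(struct phy_model *phy, int event)
{
	struct phy_event *ev;

	if (!phy->enabled)
		return false;

	ev = phy_event_get(phy);
	if (!ev) {
		phy->enabled = false;
		return false;
	}
	ev->event = event;
	return true;
}

/* Re-enable request (e.g. from sysfs): refuse while the pool is exhausted. */
static bool phy_try_enable(struct phy_model *phy)
{
	if (phy_pool_exhausted(phy))
		return false;
	phy->enabled = true;
	return true;
}
```

With this shape, a PHY that spews events can consume at most
PHY_EVENT_POOL_SIZE events before it is shut off, and it cannot be
re-enabled until at least one of its events has been processed and
returned to the pool.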
John