Date:      Sat, 7 Dec 2019 01:16:13 +0100
From:      Hans Petter Selasky <hps@selasky.org>
To:        Alexander Motin <mav@FreeBSD.org>, sgk@troutmask.apl.washington.edu
Cc:        Warner Losh <imp@bsdimp.com>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: CAM breaks USB [was Re: USB causing boot to hang]
Message-ID:  <3e5ead69-b933-70a4-a183-67552d8932fb@selasky.org>
In-Reply-To: <3df3ff25-9f62-6f0f-7823-e846a43725eb@FreeBSD.org>
References:  <20191206202316.GA1053@troutmask.apl.washington.edu> <20191206223144.GA3224@troutmask.apl.washington.edu> <CANCZdfrvTmb0xfQ_A6vLdzgNziSHSsD5TBiC1DCnriTaWcr-nw@mail.gmail.com> <20191206225231.GA949@troutmask.apl.washington.edu> <dd35656e-2cbb-f889-64dc-89d15c471e37@FreeBSD.org> <20191206234105.GA1027@troutmask.apl.washington.edu> <3df3ff25-9f62-6f0f-7823-e846a43725eb@FreeBSD.org>

On 2019-12-07 01:09, Alexander Motin wrote:
> On 06.12.2019 18:41, Steve Kargl wrote:
>> On Fri, Dec 06, 2019 at 06:15:32PM -0500, Alexander Motin wrote:
>>> On 06.12.2019 17:52, Steve Kargl wrote:
>>>> On Fri, Dec 06, 2019 at 03:33:09PM -0700, Warner Losh wrote:
>>>>> On Fri, Dec 6, 2019 at 3:31 PM Steve Kargl <sgk@troutmask.apl.washington.edu>
>>>>> wrote:
>>>>>> The problem seems to be caused by 355010.  This is a commit to
>>>>>> fix CAM, which seems to break USB.
>>>>>>
>>>>> Yes. mav@ made this change...
>>>>>
>>>> src/UPDATING seems to be missing an entry about CAM breaking USB.
>>>
>>> And also that moon is made of cheese. :-\
>>
>> Not sure what you mean.
> 
> I mean that if we are going to write there random fairy-tales, then I
> prefer my moon.
> 
> If serious, then my change did not change semantics of any existing
> tunables, only the way some of them are implemented, so there was
> nothing to write in UPDATING.
> 
>> You made a change, and the commit log
>> even notes that there could be an issue.  Yet, you want a user
>> to waste half a day finding the root cause of the problem.
> 
> I am sorry that you wasted your time, but quick and unfounded blame is
> the last thing I want to read on a Friday evening after a long day.
> 
>>>> The commit message for 355010 states:
>>>>
>>>>     Devices appearing on USB bus later may still require setting
>>>>     kern.cam.boot_delay, but hopefully those are minority.
>>>>
>>>> There is no statement about "where" kern.cam.boot_delay should be set.
>>>> There is no statement about "what"  value(s) kern.cam.boot_delay should be.
>>>
>>> If you never needed it before, you still don't need it.
>>
>> Prior to 355010 the system just boots up.  After 355010
>> the system hangs.  Will  kern.cam.boot_delay paper over
>> whatever (latent?) bug you've exposed?
> 
> My change affected the timing of the boot process, allowing the system
> to continue booting further without waiting for CAM to scan its buses
> and disks.  If the problem is reproducible even without USB storage,
> then CAM probably is not waiting for it, so it is not the problem I
> first thought of.
> 
>>> If the system hangs even without any USB disk attached, then I don't
>>> see a relation between CAM and USB here.  My change could affect some
>>> timings of the boot process, but without closer debugging it is hard
>>> to guess anything.  To be sure whether USB is related I would try to
>>> disable all USB controllers either in BIOS or with a set of loader
>>> tunables like hint.ehci.0.disabled=1 , hint.ohci.0.disabled=1 ,
>>> hint.xhci.0.disabled=1, ...
>>
>> Yep.  Completely disabling USB allows the system to boot.  I don't
>> see how this would be unexpected, as umass uses CAM.
> 
> umass uses CAM, but you said the problem happens even without umass,
> which is why I said that I don't see any relation.  Does disabling
> _all_ USB fix the problem?  Have you tried to narrow it down to a
> specific controller or device?
> 
> Is there anything special about your system?  Are you running a GENERIC
> kernel?  If not, then what have you changed?
> 
> If your kernel includes VERBOSE_SYSINIT as GENERIC does, I would try to
> set debug.verbose_sysinit=1 and see how far the boot process goes and at
> which stage it may be hanging (assuming the hang is tied to a specific
> stage and is not asynchronous).
> 

Hi,

There is an option you can compile into the kernel that lets you break
into the debugger from the keyboard:

options	ALT_BREAK_TO_DEBUGGER

Sounds to me like either a leaked refcount or one thread spinning and
blocking execution of other threads.
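
As a rough sketch of the debugging setup suggested earlier in this thread,
and assuming the tunables are placed in /boot/loader.conf (the thread does
not say where they are set), the fragment could look like the following.
The unit number 0 is taken from the quoted examples; a given machine may
have other or additional controllers.

```
# /boot/loader.conf (assumed location; names and values copied from the thread)

# Print each SYSINIT step during boot so the last one reached before the
# hang is visible.  Needs a kernel built with VERBOSE_SYSINIT.
debug.verbose_sysinit=1

# Disable USB host controllers.  Removing these lines one at a time on
# later boots can help narrow the hang down to a specific controller.
hint.ehci.0.disabled=1
hint.ohci.0.disabled=1
hint.xhci.0.disabled=1
```

The same settings can be entered at the loader prompt with "set" for a
one-off boot.  Note that ALT_BREAK_TO_DEBUGGER is only useful if the kernel
also contains a debugger to break into (options KDB and options DDB).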

--HPS


