Date:      Thu, 10 Jul 2014 14:24:17 -0500
From:      Bryan Drewery <bryan-lists@shatow.net>
To:        freebsd-hackers@freebsd.org
Subject:   Re: kldstat / kernel linker deadlock
Message-ID:  <53BEE861.1020603@shatow.net>
In-Reply-To: <201301111613.38676.jhb@freebsd.org>
References:  <50AED0B9.7040108@shatow.net> <201301111613.38676.jhb@freebsd.org>

On 1/11/2013 3:13 PM, John Baldwin wrote:
> On Thursday, November 22, 2012 08:26:17 PM Bryan Drewery wrote:
>> On 8.3-RELEASE I've hit a deadlock with kldstat.
>>
>> I can't provide much information as procstat(1) locks up and I have
>> already rebooted the servers due to it breaking quite a bit in my setup.
>>
>>> # kldstat
>>> Id Refs Address    Size     Name
>>> load: 0.91  cmd: kldstat 9936 [kernel linker] 51.21r 0.00u 0.00s 0% 768k
>>> ^C
>>> load: 0.72  cmd: kldstat 9936 [kernel linker] 225.23r 0.00u 0.00s 0% 704k
>>> load: 0.72  cmd: kldstat 9936 [kernel linker] 225.39r 0.00u 0.00s 0% 704k
>>> load: 0.42  cmd: kldstat 9936 [kernel linker] 1837.24r 0.00u 0.00s 0% 692k
>>
>> Short list of affected processes (74 in all):
>>> root        3685  0.0  0.0  3264   700  ??  D     7:27PM   0:00.00 kldstat
>>> root       67061  0.0  0.0  3380   892  ??  D     7:27PM   0:00.00 /usr/bin/netstat -nrf inet
>>> root        5579  0.0  0.0  3380   892  ??  D     7:37PM   0:00.00 /usr/bin/netstat -nrf inet
>>> root        6393  0.0  0.0  3264   704  ??  D     7:32PM   0:00.00 /sbin/kldstat -v
>>> root       99635  0.0  0.1  3324  1244  13  D+    7:52PM   0:00.01 procstat -ka
>>
>> [... 69 more removed ...]
>>
>> I had 2 minutely cron entries that were running kldstat(1)/netstat(1).
>>
>> Guessing the kldstat(1) and netstat(1) deadlocked initially.
> 
> Next time get a dump if at all possible.
> 

Pretty sure this was fixed in r224546, which was not MFC'd to
releng/8.3 before the release.

> r224546 | glebius | 2011-07-31 08:49:15 -0500 (Sun, 31 Jul 2011) | 4 lines
> Changed paths:
>    M /head/sys/kern/kern_linker.c
> 
> Don't leak kld_sx lock in kldunloadf().
> @@ -1108,12 +1108,13 @@ kern_kldunload(struct thread *td, int fi
>  #ifdef HWPMC_HOOKS
>         if (error == 0) {
>                 KLD_DOWNGRADE();
>                 PMC_CALL_HOOK(td, PMC_FN_KLD_UNLOAD, (void *) &pkm);
>                 KLD_UNLOCK_READ();
>         } else
> +               KLD_UNLOCK();
>  #else
>                 KLD_UNLOCK();
>  #endif


I reviewed my logs from that day and found that the first command I ran
was 'kldunload linux', prompted by SA-12:08.linux.asc. Running processes
were still using the linuxelf brand, so the module was in use and the
unload returned EBUSY. My kernel also had HWPMC_HOOKS since it was in
GENERIC. On that error path the kld lock was never released and stayed
xlocked, which would explain why the later kldstat(1) and netstat(1)
runs hung.
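
To make the failure mode concrete, here is a small userland sketch of
the leak. It is only a toy model, not the kernel code: struct toy_lock,
toy_try_lock(), toy_unload_leaky(), toy_unload_fixed() and toy_kldstat()
are all hypothetical names made up for illustration, and the "lock" is
just a flag. A failed toy_try_lock() stands in for the point where a
real thread would go to sleep waiting on the sx lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the kernel linker lock; it only records whether it is held. */
struct toy_lock {
	bool held;
};

/*
 * Try to take the lock. Returns false where a real sx lock would put the
 * caller to sleep, which is the state the hung kldstat processes were in.
 */
static bool
toy_try_lock(struct toy_lock *l)
{
	if (l->held)
		return (false);
	l->held = true;
	return (true);
}

static void
toy_unlock(struct toy_lock *l)
{
	assert(l->held);	/* releasing an unheld lock would be a bug */
	l->held = false;
}

/* Error path forgets to unlock: the shape of the bug described above. */
static int
toy_unload_leaky(struct toy_lock *l, int error)
{
	if (!toy_try_lock(l))
		return (EDEADLK);	/* toy marker for "would block forever" */
	if (error == 0)
		toy_unlock(l);		/* success path releases the lock */
	/* no else branch: on EBUSY the lock stays held */
	return (error);
}

/* Both paths unlock, mirroring the KLD_UNLOCK() added in the diff. */
static int
toy_unload_fixed(struct toy_lock *l, int error)
{
	if (!toy_try_lock(l))
		return (EDEADLK);
	if (error == 0)
		toy_unlock(l);
	else
		toy_unlock(l);
	return (error);
}

/* Stand-in for kldstat: it only needs to take and drop the lock. */
static bool
toy_kldstat(struct toy_lock *l)
{
	if (!toy_try_lock(l))
		return (false);		/* a real process would sleep here */
	toy_unlock(l);
	return (true);
}
```

Calling toy_unload_leaky(&l, EBUSY) on a fresh lock leaves l.held set,
so a following toy_kldstat(&l) fails to get the lock. With
toy_unload_fixed() the lock is dropped on the error path as well and
toy_kldstat() succeeds.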

-- 
Regards,
Bryan Drewery



