Date:      Fri, 30 Oct 2015 22:44:16 +0100
From:      Tijl Coosemans <tijl@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Jeff Roberson <jeff@FreeBSD.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r289279 - in head/sys: kern vm
Message-ID:  <20151030224416.51580e0f@kalimero.tijl.coosemans.org>
In-Reply-To: <20151029203334.GA2257@kib.kiev.ua>
References:  <201510140210.t9E2A79H056595@repo.freebsd.org> <20151029212554.799f76eb@kalimero.tijl.coosemans.org> <20151029203334.GA2257@kib.kiev.ua>

On Thu, 29 Oct 2015 22:33:34 +0200 Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Thu, Oct 29, 2015 at 09:25:54PM +0100, Tijl Coosemans wrote:
>> On Wed, 14 Oct 2015 02:10:07 +0000 (UTC) Jeff Roberson <jeff@FreeBSD.org> wrote:  
>>> Author: jeff
>>> Date: Wed Oct 14 02:10:07 2015
>>> New Revision: 289279
>>> URL: https://svnweb.freebsd.org/changeset/base/289279
>>> 
>>> Log:
>>>   Parallelize the buffer cache and rewrite getnewbuf().  This results in an
>>>   8x performance improvement in a microbenchmark on a 4-socket machine.
>>>   
>>>    - Get buffer headers from a per-cpu uma cache that sits in front of the
>>>      free queue.
>>>    - Use a per-cpu quantum cache in vmem to eliminate contention for kva.
>>>    - Use multiple clean queues according to buffer cache size to eliminate
>>>      clean queue lock contention.
>>>    - Introduce a bufspace daemon that attempts to prevent getnewbuf() callers
>>>      from blocking or doing direct recycling.
>>>    - Close some bufspace allocation races that could lead to endless
>>>      recycling.
>>>    - Further the transition to a more modern style of small functions grouped
>>>      by prefix in order to manage growing complexity.
>> 
>> I have an i386 system that locks up easily after this commit.  Booting
>> into single user and running make installkernel triggers it consistently.
>> I haven't been able to reproduce it on amd64.  Examining threads with
>> DDB shows that they are all at sched_switch (a few at fork_trampoline).
>> The only lock being held is Giant (by the interrupt handler for
>> ctrl+alt+esc I think).  So it doesn't look like a deadlock.  It's more
>> a sleeping beauty situation.  All threads in the castle are sleeping and
>> there's no prince to wake them up.
>> 
>> (kgdb) info thread
>>   Id   Target Id         Frame
>> 
>> These are from make installkernel:
>> 
>>   72   Thread 100071 (PID=107: install) sched_switch (td=0xc667d000, 
>>     newtd=0xc6407000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   71   Thread 100070 (PID=81: make) sched_switch (td=0xc667d340, 
>>     newtd=0xc667d000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   70   Thread 100067 (PID=30: make) sched_switch (td=0xc667e000, 
>>     newtd=0xc667d340, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   69   Thread 100066 (PID=25: make) sched_switch (td=0xc667e340, 
>>     newtd=0xc667e000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>> 
>> Single user shell:
>> 
>>   68   Thread 100065 (PID=17: sh) sched_switch (td=0xc6406000, 
>>     newtd=0xc667e340, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>> 
>> Kernel threads:
>> 
>>   67   Thread 100063 (PID=16: vnlru) sched_switch (td=0xc6406680, 
>>     newtd=0xc6407340, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   66   Thread 100062 (PID=9: syncer) sched_switch (td=0xc64069c0, 
>>     newtd=0xc667d000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   65   Thread 100061 (PID=8: bufspacedaemon) sched_switch (td=0xc6407000, 
>>     newtd=0xc62dc000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   64   Thread 100060 (PID=7: bufdaemon) sched_switch (td=0xc6407340, 
>>     newtd=0xc6408000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   63   Thread 100068 (PID=7: bufdaemon//var worker) sched_switch (
>>     td=0xc667d9c0, newtd=0xc6407000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   62   Thread 100069 (PID=7: bufdaemon//usr worker) sched_switch (
>>     td=0xc667d680, newtd=0xc667d000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   61   Thread 100059 (PID=6: pagezero) sched_switch (td=0xc6407680, 
>>     newtd=0xc55ba680, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   60   Thread 100058 (PID=5: vmdaemon) sched_switch (td=0xc64079c0, 
>>     newtd=0xc6407340, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   59   Thread 100057 (PID=4: pagedaemon) sched_switch (td=0xc6408000, 
>>     newtd=0xc6407000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   58   Thread 100064 (PID=4: pagedaemon/uma) sched_switch (td=0xc6406340, 
>>     newtd=0xc55b9340, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>>   57   Thread 100050 (PID=15: acpi_cooling0) sched_switch (td=0xc62dc340, 
>>     newtd=0xc6407000, flags=<optimized out>)
>>     at /usr/src/sys/kern/sched_ule.c:1969
>> ....
>> 
>> Anything else you need to debug this?  
> 
> Start with gathering the information listed in
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

https://people.freebsd.org/~tijl/r289279-dead.txt

r290155 doesn't fix it, by the way.


