Date:      Fri, 28 Jun 2013 11:57:14 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        hackers@freebsd.org
Subject:   Re: b_freelist TAILQ/SLIST
Message-ID:  <51CD4FEA.7030605@FreeBSD.org>
In-Reply-To: <20130628065732.GL91021@kib.kiev.ua>
References:  <51CCAE14.6040504@FreeBSD.org> <20130628065732.GL91021@kib.kiev.ua>

On 28.06.2013 09:57, Konstantin Belousov wrote:
> On Fri, Jun 28, 2013 at 12:26:44AM +0300, Alexander Motin wrote:
>> While doing some profiles of GEOM/CAM IOPS scalability, on some test
>> patterns I've noticed serious congestion with spinning on global
>> pbuf_mtx mutex inside getpbuf() and relpbuf(). Since that code is
>> already very simple, I've tried to optimize probably the only thing
>> possible there: switch bswlist from TAILQ to SLIST. As I can see,
>> b_freelist field of struct buf is really used as TAILQ in some other
>> places, so I've just added another SLIST_ENTRY field. And result
>> appeared to be surprising -- I can no longer reproduce the issue at all.
>> Maybe it was just unlucky synchronization of a specific test, but I've
>> seen it on two different systems and rechecked the results with/without
>> the patch three times.
> This is too unbelievable.

I understand that it looks like magic. I was very surprised to see
contention there at all, but `pmcstat -n 10000000 -TS unhalted-cycles`
shows it too often, and repeatably:

PMC: [CPU_CLK_UNHALTED_CORE] Samples: 28052 (100.0%) , 12 unresolved

%SAMP IMAGE      FUNCTION             CALLERS
  46.4 kernel     __mtx_lock_sleep     relpbuf:22.3 getpbuf:22.0 xpt_run_devq:0.8
  13.3 kernel     _mtx_lock_spin_cooki turnstile_trywait
   4.3 kernel     cpu_search_lowest    cpu_search_lowest
   2.3 kernel     getpbuf              physio

Benchmark results confirm it as well.

> Could it be, e.g., some cache line conflicts
> which in fact cause the thrashing? Does it help if you add void *b_pad
> before b_freelist instead of adding b_freeslist?

No, this doesn't help. I had also previously tested with b_freeslist
in place but without the other changes, and that didn't help either.

>> The present patch is here:
>> http://people.freebsd.org/~mav/buf_slist.patch
>>
>> The question is how to do it better? What is the KPI/KBI policy for
>> struct buf? I could replace b_freelist by a union and keep KBI, but
>> partially break KPI. Or I could add another field, probably breaking
>> KBI, but keeping KPI. Or I could do something handmade with no breakage.
>> Or this change is just a bad idea?
> The same question about using a union for b_freelist/b_freeslist: is the
> effect of magically fixing the contention still there if b_freeslist
> is at the same offset as b_freelist?

Yes, it is.
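
For illustration only, the union variant could look roughly like the sketch below. It is not taken from either patch: struct xbuf, b_link, xbuf_get() and xbuf_release() are hypothetical stand-ins for struct buf, getpbuf() and relpbuf(), and the pbuf_mtx locking that the real functions do is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical cut-down stand-in for struct buf; not the real layout. */
struct xbuf {
	int b_id;
	union {
		TAILQ_ENTRY(xbuf) b_freelist;	/* next + back-link pointers */
		SLIST_ENTRY(xbuf) b_freeslist;	/* next pointer only */
	} b_link;				/* assumed name for the union */
};

SLIST_HEAD(xbuf_slist, xbuf);

/* Both linkage fields start at the same offset, so the layout is unchanged. */
_Static_assert(offsetof(struct xbuf, b_link.b_freelist) ==
    offsetof(struct xbuf, b_link.b_freeslist),
    "list linkage fields must share one offset");

/* Return a buffer to the free list (caller would hold the list mutex). */
static void
xbuf_release(struct xbuf_slist *head, struct xbuf *bp)
{
	assert(bp != NULL);
	SLIST_INSERT_HEAD(head, bp, b_link.b_freeslist);
}

/* Take a buffer from the free list, or NULL if the list is empty. */
static struct xbuf *
xbuf_get(struct xbuf_slist *head)
{
	struct xbuf *bp = SLIST_FIRST(head);

	if (bp != NULL)
		SLIST_REMOVE_HEAD(head, b_link.b_freeslist);
	return (bp);
}
```

With SLIST, insert and remove at the head each write the head pointer and one next pointer; the TAILQ equivalents also have to maintain the back-links.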

> There is no K{B,P}I policy for struct buf in HEAD, just change it as
> it fits.

Which one would you prefer, the original or 
http://people.freebsd.org/~mav/buf_slist2.patch ?

Thank you.

-- 
Alexander Motin


