Date:      Mon, 6 Jul 2020 20:26:49 +0200
From:      Jan Bramkamp <crest@rlwinm.de>
To:        freebsd-geom@freebsd.org
Subject:   Re: Single-threaded bottleneck in geli
Message-ID:  <8ca74f4b-8935-b089-675d-df9b7fc2ae50@rlwinm.de>
In-Reply-To: <550d61f9-506e-710c-8800-4f13143cf976@rlwinm.de>
References:  <CAOtMX2hFaNCmwkuhfWWqNwACETtomnJroTC1_giOO_iFj0SKFQ@mail.gmail.com> <550d61f9-506e-710c-8800-4f13143cf976@rlwinm.de>

On 06.07.20 20:21, Jan Bramkamp wrote:
> On 03.07.20 21:30, Alan Somers wrote:
>> I'm using geli, gmultipath, and ZFS on a large system, with hundreds of
>> drives.  What I'm seeing is that under at least some workloads, the
>> overall performance is limited by the single geom kernel process.
>> procstat and kgdb aren't much help in telling exactly why this process
>> is using so much CPU, but it certainly must be related to the fact that
>> over 15,000 IOPs are going through that thread.  What can I do to
>> improve this situation?  Would it make sense to enable direct dispatch
>> for geli?  That would hurt single-threaded performance, but probably
>> improve performance for highly multithreaded workloads like mine.
>>
>> Example top output:
>>   PID USERNAME   PRI NICE   SIZE    RES STATE    C   TIME   WCPU COMMAND
>>    13 root        -8    -     0B    96K CPU46   46  82.7H 70.54% geom{g_down}
>>    13 root        -8    -     0B    96K -        9  35.5H 25.32% geom{g_up}
>>
>> -Alan
>
> The problem isn't GELI. The problem is that gmultipath lacks direct
> dispatch support. About one and a half years ago I ran into the same
> problem. Because I needed the performance, I looked at what gmultipath
> did and found no reason why it had to run in the GEOM up and down
> threads, so I patched in the flags claiming direct dispatch support.
> It improved my read performance from 2.2GB/s to 3.4GB/s and my write
> performance from 750MB/s to 1.5GB/s. The system worked for a few days
> under high load (it saturated a 2 x 10Gb/s lagg(4) as a read-only
> WebDAV server while also receiving uploads via SFTP) until I attempted
> to shut down the system. It hung on shutdown and never powered off. I
> had to power cycle the box via IPMI to recover. I never found the time
> to debug this problem.
>
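For reference, the change amounted to setting GEOM's direct-dispatch flags on gmultipath's provider and consumers. A minimal sketch of the idea (from memory, assuming the stock G_PF_/G_CF_ flag names from sys/geom/geom.h; not my exact patch):

```c
/* Sketch only: roughly what the gmultipath patch looked like.
 * In the geom creation path in sys/geom/multipath/g_multipath.c,
 * after the provider is allocated, claim direct dispatch so that
 * bios can bypass the single g_up/g_down kernel threads:
 */
pp = g_new_providerf(gp, "multipath/%s", md->md_name);
pp->flags |= G_PF_DIRECT_SEND | G_PF_DIRECT_RECEIVE;

/* ...and likewise for each consumer attached to a path: */
cp = g_new_consumer(gp);
cp->flags |= G_CF_DIRECT_SEND | G_CF_DIRECT_RECEIVE;
```

A class claiming these flags must make its start/done paths safe to run in the caller's context (no sleeping, proper locking), which is presumably where my shutdown hang came from.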
The server in question was an old Westmere dual hexacore with 96GB RAM. 
It had 42 (+ 3 spares) 7200rpm SAS2 disks in a dual-ported JBOD 
connected via 4 x 4 lanes to two LSI2?08 HBAs with IT firmware, 
configured as a 3-way mirrored ZFS pool with GELI underneath.
