Date:      Fri, 5 Sep 2014 18:24:56 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        freebsd-net@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, Scott Long <scottl@FreeBSD.org>
Subject:   Re: [RFC] Patch to improve TSO limitation formula in general
Message-ID:  <575724560.33015896.1409955896874.JavaMail.root@uoguelph.ca>
In-Reply-To: <540A0301.9040701@selasky.org>

Hans Petter Selasky wrote:
> Hi,
> 
> I've tested the attached patch with success and would like to have some
> feedback from other FreeBSD network developers. The problem is that the
> current TSO limitation only limits the number of bytes that can be
> transferred in a TSO packet and not the number of mbufs.
> 
> The current solution is to have a quick and dirty custom m_dup() in the
> TX path to re-allocate the mbuf chains into 4K ones to make it simple.
> All of this hack can be avoided if the definition of the TSO limit can
> be changed a bit, as shown here:
> 
> 
>   /*
> + * Structure defining hardware TSO limits.
> + */
> +struct if_tso_limit {
> +       u_int raw_value[0];     /* access all fields as one */
> +       u_char frag_count;      /* maximum number of fragments: 1..255 */
> +       u_char frag_size_log2;  /* maximum fragment size: 2 ** (12..16) */
> +       u_char hdr_size_log2;   /* maximum header size: 2 ** (2..8) */
> +       u_char reserved;        /* zero */
> +};
> 
> 
> First we need to know the maximum fragment count. Typical value is 32.
> Second we need to know the maximum fragment size. Typical value is 4K.
> Last we need to know of any headers that should be subtracted from the
> maximum. Since this code runs in the fast path, I would like to
> use "u_char" for all fields and allow copy-only access as a "u_int" as
> an optimization. This avoids kludges and messing with additional header
> files.
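
As a rough illustration of how the three fields described above could combine
into a byte limit, here is a minimal C sketch. The struct name, the function
name, and the formula itself (fragment count times fragment size, minus header
room) are assumptions made for illustration; the message does not show the
computation the patch actually uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the proposed fields. This is NOT the patch's
 * definition; the zero-length raw_value[] member is left out to stay
 * within standard C.
 */
struct tso_limit_sketch {
	uint8_t frag_count;      /* maximum number of fragments: 1..255 */
	uint8_t frag_size_log2;  /* maximum fragment size: 2 ** (12..16) */
	uint8_t hdr_size_log2;   /* maximum header size: 2 ** (2..8) */
	uint8_t reserved;        /* zero */
};

/*
 * Assumed formula: total bytes the fragments can hold, minus the room
 * reserved for headers. Returns 0 if the headers would consume it all.
 */
static uint32_t
tso_max_payload_sketch(const struct tso_limit_sketch *lim)
{
	uint32_t total, hdr;

	/* Keep the shifts within the ranges documented on the fields. */
	assert(lim->frag_size_log2 <= 16 && lim->hdr_size_log2 <= 8);

	total = (uint32_t)lim->frag_count << lim->frag_size_log2;
	hdr = (uint32_t)1 << lim->hdr_size_log2;
	return (total > hdr ? total - hdr : 0);
}
```

With the typical values mentioned above (32 fragments of 4K each) and an
assumed 256-byte header allowance, this sketch yields 32 * 4096 - 256 bytes.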
> 
> I would like to push this patch after some more testing to -current and
> then to 10-stable, hopefully before the coming 10-release, because the
> current solution is affecting performance of the Mellanox based network
> adapters in an unfair way. For example, by setting the current TSO limit
> to 32KBytes, which will be OK for all-2K fragments, we see a severe
> degradation in performance, even though the hardware is fully capable of
> transmitting 16 4K mbufs.
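
The arithmetic behind this example can be checked with a small C sketch. The
helper name is hypothetical, and the 2K and 4K cluster sizes are passed in as
plain numbers rather than taken from kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of fixed-size clusters needed to hold len bytes (rounded up).
 * Hypothetical helper for illustration only.
 */
static uint32_t
frags_needed(uint32_t len, uint32_t cluster)
{
	assert(cluster != 0);
	return ((len + cluster - 1) / cluster);
}
```

Under these assumptions, frags_needed(32768, 2048) and
frags_needed(65536, 4096) both come to 16 fragments, so a byte-only limit
sized for all-2K chains caps a 4K-cluster chain at half of what the same
16-fragment hardware budget could carry.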
> 
Ok, I didn't see this until now, but I will take a look at the patch.

My main comment is that I tried using a mix of 2K and 4K mbuf clusters in
NFS and was able (with some effort) to get the UMA allocator all messed up, where
it would basically be stuck because it couldn't allocate boundary tags.

As such, until this issue w.r.t. UMA is resolved, mixing MCLBYTES and MJUMPAGESIZE
clusters is very dangerous imho. (alc@ did send me a simple patch related to this
UMA problem, but I haven't been able to test it yet.)

rick
ps: For the M_WAITOK case, the allocator problem shows up as threads sleeping
    on "btallo", which happens in vmem_bt_alloc() in kern/subr_vmem.c.
    It may never happen on 64-bit arches, but it can definitely happen on i386.

> Comments and reviews are welcome!
> 
> --HPS
