Date: Sat, 21 Nov 2009 08:10:14 -0900
From: Mel Flynn <mel.flynn+fbsd.questions@mailing.thruhere.net>
To: Brett Glass <brett@lariat.net>
Cc: questions@freebsd.org
Subject: Re: kern.polling.lost_polls
Message-ID: <b491a0c45ff8b78fcd75239b31bd1c9b@sbmail.office-on-the.net>
In-Reply-To: <200911210207.TAA21572@lariat.net>
References: <200911202135.OAA18537@lariat.net> <db2308c2d90148218fcc9209721b9920@sbmail.office-on-the.net> <200911210207.TAA21572@lariat.net>
On Fri, 20 Nov 2009 19:07:42 -0700, Brett Glass <brett@lariat.net> wrote:
> At 06:25 PM 11/20/2009, Mel Flynn wrote:
>
>>So that means that you give the kernel .25 microseconds to poll and act on
>>any pending network IO. That's probably not enough.
>
> I think that you mean ".25 milliseconds," not ".25 microseconds," above.
Yes, sorry. It should be enough, but... it depends on CPU speed and the
number of interfaces. The people on FreeBSD-net can give you better advice,
most notably on whether all 6 interfaces are handled in one poll, so that
each task needs to be completed within 1/HZ/N. I cannot say this with
certainty.
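To put numbers on the 1/HZ/N idea, here is a minimal sketch of the arithmetic. It assumes, without confirmation from kern_poll.c, that all N interfaces share one clock tick equally; the function name poll_budget_us is made up for illustration.

```python
def poll_budget_us(hz: int, n_interfaces: int = 1) -> float:
    """Microseconds available per interface within one clock tick.

    Assumption (unconfirmed): every interface is polled inside the same
    tick and the tick is split evenly, i.e. the budget is 1/HZ/N.
    """
    if hz <= 0 or n_interfaces <= 0:
        raise ValueError("hz and n_interfaces must be positive")
    tick_us = 1_000_000 / hz          # length of one tick in microseconds
    return tick_us / n_interfaces     # even share per interface


if __name__ == "__main__":
    # HZ=4000 gives the 0.25 ms tick discussed above; 6 is the interface count.
    for hz in (1000, 2000, 4000):
        whole_tick = poll_budget_us(hz)
        per_if = poll_budget_us(hz, 6)
        print(f"HZ={hz}: tick={whole_tick:.1f} us, per interface={per_if:.1f} us")
```

If the tick really is shared, raising HZ shrinks the per-interface budget in direct proportion, which would explain lost polls getting worse rather than better at high HZ.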
>>It is further explained by the
>>comment in sys/kern/kern_poll.c:
>>/*
>> * Hook from hardclock. Tries to schedule a netisr, but keeps track
>> * of lost ticks due to the previous handler taking too long.
>> * Normally, this should not happen, because polling handler should
>> * run for a short time. However, in some cases (e.g. when there are
>> * changes in link status etc.) the drivers take a very long time
>> * (even in the order of milliseconds) to reset and reconfigure the
>> * device, causing apparent lost polls.
>> *
>> * The first part of the code is just for debugging purposes, and tries
>> * to count how often hardclock ticks are shorter than they should,
>> * meaning either stray interrupts or delayed events.
>> */
>
> Well, even at HZ=2000, kern.polling.lost_polls and
> kern.polling.suspect are both incrementing, as is kern.polling.stalled:
>
> stargate# sysctl -a | grep polling
> kern.polling.burst: 150
> kern.polling.burst_max: 150
> kern.polling.each_burst: 5
> kern.polling.idle_poll: 0
> kern.polling.user_frac: 50
> kern.polling.reg_frac: 20
> kern.polling.short_ticks: 0
> kern.polling.lost_polls: 41229
> kern.polling.pending_polls: 0
> kern.polling.residual_burst: 0
> kern.polling.handlers: 2
That bugs me: if you have 6 devices, the number of handlers should be 6.
/*
* Try to register routine for polling. Returns 0 if successful
* (and polling should be enabled), error code otherwise.
* A device is not supposed to register itself multiple times.
*
* This is called from within the *_ioctl() functions.
*/
Unless "device" in that comment should really read "driver" (you have two
drivers, fxp and em, which would match 2 handlers), but I think it means
devices.
> kern.polling.enable: 0
> kern.polling.phase: 0
> kern.polling.suspect: 31653
> kern.polling.stalled: 10
> kern.polling.idlepoll_sleeping: 1
> hw.acpi.thermal.polling_rate: 10
>
> But if I slow the clock down to 1000 Hz, it's unclear if the
> machine will be able to keep up with traffic. I was already getting
> more than 1,000 network interrupts per second before I tried
> polling, and I'm not sure how many packets the interfaces (some
> fxp, some em) can buffer up. I'm going to try it, but if it doesn't
> work I will have to go back to interrupt-driven operation.
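As a rough way to judge whether HZ=1000 can keep up, the sketch below estimates the per-interface packet ceiling. It relies on my reading of polling(4), namely that with polling enabled each interface can receive at most HZ * burst_max packets per second unless idle-loop polling has spare cycles; treat that as an assumption to verify. The function names and the 50,000 packets/s load are hypothetical, and only burst_max=150 comes from the sysctl output quoted above.

```python
def polled_pps_ceiling(hz: int, burst_max: int) -> int:
    """Upper bound on packets/s per interface under polling.

    Assumption: the ceiling is HZ * burst_max, as polling(4) is
    understood to describe it (idle-loop polling ignored).
    """
    return hz * burst_max


def packets_per_tick(pps: float, hz: int) -> float:
    """Average packets that accumulate between two consecutive polls."""
    return pps / hz


if __name__ == "__main__":
    burst_max = 150                   # kern.polling.burst_max from the quoted output
    hypothetical_pps = 50_000         # made-up load, not a measured value
    for hz in (1000, 2000):
        ceiling = polled_pps_ceiling(hz, burst_max)
        backlog = packets_per_tick(hypothetical_pps, hz)
        print(f"HZ={hz}: ceiling={ceiling} pps, ~{backlog:.0f} packets per tick")
```

The packets-per-tick figure is what the NIC receive ring has to hold between polls, so it is the number to compare against the ring size of the fxp and em cards once that is known.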
If your network architecture allows it, you might be able to bring down
the task load by increasing the MTU and enabling jumbo frames.
From em(4):
Support for Jumbo Frames is provided via the interface MTU setting.
Selecting an MTU larger than 1500 bytes with the ifconfig(8) utility
configures the adapter to receive and transmit Jumbo Frames. The maximum
MTU size for Jumbo Frames is 16114.
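To illustrate why a larger MTU lowers the task load, here is a back-of-the-envelope sketch of frames per second at a fixed throughput. It assumes full-size frames, counts only the 14-byte Ethernet header and 4-byte FCS as overhead, and uses 1 Gbit/s and an MTU of 9000 as hypothetical example values (the man page above only states the 16114 maximum).

```python
ETH_OVERHEAD = 18  # 14-byte Ethernet header + 4-byte FCS; preamble, gap, VLAN tags ignored


def frames_per_second(bits_per_second: float, mtu: int) -> float:
    """Frames/s needed to carry the given bit rate using full-size frames."""
    frame_bytes = mtu + ETH_OVERHEAD
    return bits_per_second / (8 * frame_bytes)


if __name__ == "__main__":
    rate = 1e9                        # hypothetical 1 Gbit/s of traffic
    standard = frames_per_second(rate, 1500)
    jumbo = frames_per_second(rate, 9000)   # 9000 is an example jumbo MTU
    print(f"MTU 1500: {standard:.0f} frames/s")
    print(f"MTU 9000: {jumbo:.0f} frames/s")
    print(f"reduction factor: {standard / jumbo:.2f}")
```

Fewer frames for the same payload means fewer packets to handle per poll, but only if every host and switch on the segment accepts the larger MTU.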
-- 
Mel
