Date:      Mon,  1 Sep 2008 18:46:21 +0400 (MSD)
From:      Anton Yuzhaninov <citrin@citrin.ru>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/127024: Problem with unix sockets garbage collector
Message-ID:  <20080901144621.8751589A47D@mx22.rambler.ru>
Resent-Message-ID: <200809011510.m81FA2WT093644@freefall.freebsd.org>


>Number:         127024
>Category:       kern
>Synopsis:       Problem with unix sockets garbage collector
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Sep 01 15:10:00 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Anton Yuzhaninov
>Release:        FreeBSD 7.0-STABLE amd64
>Organization:
Rambler
>Environment:
System: FreeBSD mx22.rambler.ru 7.0-STABLE FreeBSD 7.0-STABLE #1: Fri Jun 27 16:59:59 MSD 2008 root@mx22.rambler.ru:/usr/obj/usr/src/sys/MAIL amd64

The problem occurs on SMP boxes when unix sockets are used under high load.
In our case it is a server running the postfix MTA, where unix sockets are used for IPC.

>Description:
1. Normal operation (after reboot):

The "thread taskq" kernel thread in top is at about 0.00% WCPU.

sysctl net.local.inflight is almost always zero.
sysctl net.local.taskcount increases only rarely.

2. After several days of uptime, "thread taskq" starts to consume an entire CPU (100% WCPU):

1684 processes:26 running, 1639 sleeping, 19 waiting
CPU states:  6.7% user,  0.0% nice, 54.5% system,  1.1% interrupt, 37.7% idle
Mem: 1332M Active, 1903M Inact, 505M Wired, 118M Cache, 214M Buf, 76M Free
Swap: 2060M Total, 2060M Free

   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
     9 root        1   8    -     0K    16K CPU1   1 536:07 100.00% thread taskq
    12 root        1 171 ki31     0K    16K RUN    0  53.5H 64.06% idle: cpu0
    11 root        1 171 ki31     0K    16K RUN    1  50.3H 14.26% idle: cpu1

sysctl net.local.inflight is always less than 0 (I see values from -1 to -4).
sysctl net.local.taskcount increases at a high rate (about 100 per second).
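
A minimal sketch for watching the two counters above and flagging the bad state. It shells out to `sysctl -n`; the helper names and the 50-per-second threshold are assumptions chosen for illustration, not part of the original observation.

```python
import subprocess
import time


def read_sysctl_int(name):
    """Read an integer sysctl value via `sysctl -n NAME`."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


def rate(prev_count, curr_count, seconds):
    """Counter increase per second between two samples."""
    return (curr_count - prev_count) / seconds


def looks_broken(inflight, taskcount_rate, rate_threshold=50.0):
    """True when both symptoms are present: negative inflight and a
    fast-growing taskcount. The threshold is an assumed value."""
    return inflight < 0 and taskcount_rate >= rate_threshold


def monitor(interval=1.0, samples=60):
    """Print one line per sample; intended to run on the affected host."""
    prev = read_sysctl_int("net.local.taskcount")
    for _ in range(samples):
        time.sleep(interval)
        curr = read_sysctl_int("net.local.taskcount")
        inflight = read_sysctl_int("net.local.inflight")
        per_sec = rate(prev, curr, interval)
        flag = "SUSPECT" if looks_broken(inflight, per_sec) else "ok"
        print(f"inflight={inflight} taskcount/s={per_sec:.1f} {flag}")
        prev = curr
```

Calling `monitor()` on the mail server should print "ok" lines during normal operation and "SUSPECT" lines once the state described in item 2 is reached.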

It seems to be a race in the unix sockets code, because we can't reproduce this on a uniprocessor box.

>How-To-Repeat:
Run the postfix MTA on a heavily loaded mail server (> 100 connections per second) with 6-stable or 7-stable (SMP).
The problem should occur after several days (or weeks) of uptime.
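
For testing without a production mail load, a synthetic load that passes file descriptors over unix sockets might be a starting point. This is only a sketch under the assumption that descriptor passing (SCM_RIGHTS) is what drives net.local.inflight; it is not known to reproduce the problem, and the function names are hypothetical. A single copy is single-threaded, so several copies would probably need to run concurrently on an SMP box.

```python
import array
import os
import socket


def pass_descriptor_once():
    """Pass one pipe descriptor over a unix socketpair; return how many arrived."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pipe_r, pipe_w = os.pipe()
    try:
        # Sender: attach the descriptor as SCM_RIGHTS ancillary data.
        rights = array.array("i", [pipe_r])
        left.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])

        # Receiver: reserve ancillary space for exactly one descriptor.
        received = array.array("i")
        _msg, ancdata, _flags, _addr = right.recvmsg(
            1, socket.CMSG_LEN(received.itemsize)
        )
        for level, ctype, data in ancdata:
            if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
                usable = len(data) - (len(data) % received.itemsize)
                received.frombytes(data[:usable])
        # Close the received duplicates so descriptors do not leak.
        for fd in received:
            os.close(fd)
        return len(received)
    finally:
        os.close(pipe_r)
        os.close(pipe_w)
        left.close()
        right.close()


def run(iterations):
    """Repeat the pass/close cycle and return the total descriptors passed."""
    return sum(pass_descriptor_once() for _ in range(iterations))
```

For example, `run(100000)` in each of several processes generates a steady stream of in-flight descriptors while the sysctl values are watched.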

>Fix:
Not known yet.
This problem may be fixed in 8-current, but we can't run 8-current on this hardware.
>Release-Note:
>Audit-Trail:
>Unformatted:


