Date:      Sun, 26 Jan 2014 02:12:51 GMT
From:      Yury <hawk256@yandex.ru>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   amd64/186114: MPD5.7 umtxn
Message-ID:  <201401260212.s0Q2Cp60012677@oldred.freebsd.org>
Resent-Message-ID: <201401260220.s0Q2K0xf018653@freefall.freebsd.org>


>Number:         186114
>Category:       amd64
>Synopsis:       MPD5.7 umtxn
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          update
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jan 26 02:20:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Yury
>Release:        FreeBSD 10.0
>Organization:
GreenLine
>Environment:
FreeBSD gw01.comteks.biz 10.0-STABLE FreeBSD 10.0-STABLE #0 r261173: Sun Jan 26 03:58:09 MSK 2014     hawk@gw01.comteks.biz:/usr/obj/usr/src/sys/Hawk  amd64

>Description:
I run a BRAS on FreeBSD. It was running 9.2-STABLE. I updated it to 10.0-RELEASE and later to 10.0-STABLE, and I have the same problem on both.

For about 5 minutes after start it works normally. But after 100-150 users have connected through PPPoE (MPD 5.7), the mpd5 process hangs in state umtxn.

Of course, no one can connect after that, but users who are already connected keep working.

last pid: 17712;  load averages:  1.16,  0.65,  0.27          up 0+00:01:51  05:28:23
50 processes:  1 running, 49 sleeping
CPU:  0.0% user,  0.0% nice,  1.0% system,  0.9% interrupt, 98.1% idle
Mem: 1162M Active, 56M Inact, 400M Wired, 145M Buf, 2274M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 2535 root          1  20    0   201M   184M select  3   1:14  10.69% zebra
 2476 _pflogd       1  20    0 14600K  2200K bpf     0   0:12   0.00% pflogd
 2541 root          1  20    0   224M   206M select  2   0:07   0.00% bgpd
 9803 root          1  20    0 78624K 44092K select  2   0:02   0.00% bsnmpd
 3462 root          3  20    0 56736K  9164K umtxn   0   0:01   0.00% mpd5
 7243 mysql        17  32    0  6958M   636M uwait   1   0:01   0.00% mysqld
 6095 bind          7  20    0   129M 76864K kqread  1   0:01   0.00% named
 3872 root          1  20    0 61124K  6808K select  1   0:00   0.00% nmbd
 8644 root          3  20    0 47332K  6216K select  1   0:00   0.00% utm5_rfw


procstat -k 3462
  PID    TID COMM             TDNAME           KSTACK
 3462 100113 mpd5             -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
 3462 100115 mpd5             -                mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait sys_poll amd64_syscall Xfast_syscall
 3462 100512 mpd5             -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
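
The stacks above show two of the three mpd5 threads sleeping in do_lock_umutex, i.e. waiting for a userland (pthread) mutex. For anyone collecting the same data on a similar hang, here is a minimal sketch that picks such threads out of saved `procstat -k` output. It is not part of mpd or of any FreeBSD tool; the function names and the UMTX_FRAMES set are hypothetical, chosen from the frames visible in this report, and the parser assumes that TDNAME is a single token.

```python
# Sketch: find threads that are blocked on a userland mutex in the text
# output of `procstat -k <pid>`. All names here are illustrative.

# Assumption: these frame names (taken from the stacks in this report)
# mark a thread waiting on a umtx-backed mutex.
UMTX_FRAMES = {"do_lock_umutex", "umtxq_sleep"}

SAMPLE = """\
  PID    TID COMM             TDNAME           KSTACK
 3462 100113 mpd5             -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
 3462 100115 mpd5             -                mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait sys_poll amd64_syscall Xfast_syscall
 3462 100512 mpd5             -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
"""


def parse_procstat_k(text):
    """Return a list of (pid, tid, comm, frames) tuples, skipping the header."""
    threads = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric PID and TID; the header row does not.
        if len(fields) < 5 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        pid, tid, comm = int(fields[0]), int(fields[1]), fields[2]
        # fields[3] is TDNAME ("-" when the thread is unnamed).
        frames = fields[4:]
        threads.append((pid, tid, comm, frames))
    return threads


def blocked_on_umutex(threads):
    """Return the TIDs whose kernel stack contains a umtx lock frame."""
    return [tid for _, tid, _, frames in threads if UMTX_FRAMES & set(frames)]


threads = parse_procstat_k(SAMPLE)
print(blocked_on_umutex(threads))  # prints [100113, 100512]
```

The remaining thread (100115) is in sys_poll, which is an ordinary idle wait, so the two mutex waiters are the ones worth matching against a userland backtrace.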



/var/log/mpd.log
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPCP: Up event
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPCP: state change Starting --> Req-Sent
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPCP: SendConfigReq #1
Jan 26 05:28:13 gw01 mpd: [B_ppp-46]   IPADDR 10.10.0.1
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPV6CP: Up event
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPV6CP: state change Starting --> Req-Sent
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPV6CP: SendConfigReq #1
Jan 26 05:28:13 gw01 mpd: [vlan6-107] LCP: rec'd Terminate Request #240 (Opened)
Jan 26 05:28:13 gw01 mpd: [vlan6-107] LCP: state change Opened --> Stopping
Jan 26 05:28:13 gw01 mpd: [vlan6-107] Link: Leave bundle "B_ppp-46"

The log always stops with the same last 3 lines.



Jan 26 05:52:38 gw01 kernel: sonewconn: pcb 0xfffff80007757c40: Listen queue overflow: 4 already in queue awaiting acceptance
Jan 26 05:53:09 gw01 last message repeated 60 times
Jan 26 05:53:34 gw01 last message repeated 51 times
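
To get a feel for how often the listen queue overflows, the kernel messages above can be totalled, including the syslog "last message repeated N times" lines. The following is a rough sketch only; the function name and regular expressions are hypothetical, and it assumes that a "repeated" line refers to the immediately preceding logged message and that a line beginning with whitespace is a wrapped continuation of the previous one.

```python
import re

# Sketch: total the "Listen queue overflow" events in a syslog excerpt.
OVERFLOW_RE = re.compile(r"sonewconn: pcb 0x[0-9a-f]+: Listen queue overflow")
REPEAT_RE = re.compile(r"last message repeated (\d+) times")

SAMPLE = """\
Jan 26 05:52:38 gw01 kernel: sonewconn: pcb 0xfffff80007757c40: Listen queue overflow:
 4 already in queue awaiting acceptance
Jan 26 05:53:09 gw01 last message repeated 60 times
Jan 26 05:53:34 gw01 last message repeated 51 times
"""


def count_overflows(lines):
    """Count overflow messages, expanding syslog repeat summaries."""
    total = 0
    prev_is_overflow = False
    for line in lines:
        # Blank lines and wrapped continuations do not change the state.
        if not line.strip() or line[:1].isspace():
            continue
        if OVERFLOW_RE.search(line):
            total += 1
            prev_is_overflow = True
            continue
        match = REPEAT_RE.search(line)
        if match:
            # Assumption: the repeat summary refers to the previous message.
            if prev_is_overflow:
                total += int(match.group(1))
            continue
        # Any other message ends the run of overflow reports.
        prev_is_overflow = False
    return total


print(count_overflows(SAMPLE.splitlines()))  # prints 112
```

For the excerpt above this gives 1 + 60 + 51 = 112 overflow reports in under a minute, all for the same pcb.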


Kernel conf:
GENERIC + 
device          ipmi
device          coretemp
device          smbus

device          lagg
device          netmap

options         IPI_PREEMPTION

options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPDIVERT
options         DUMMYNET
options         IPFIREWALL_NAT
options         LIBALIAS

device          pf
device          pflog
device          pfsync

options         ALTQ
options         ALTQ_CBQ        # Class Based Queuing (CBQ)
options         ALTQ_RED        # Random Early Detection (RED)
options         ALTQ_RIO        # RED In/Out
options         ALTQ_HFSC       # Hierarchical Packet Scheduler (HFSC)
options         ALTQ_PRIQ       # Priority Queuing (PRIQ)
options         ALTQ_NOPCC      # Required for SMP build

options         NETGRAPH
options         NETGRAPH_BPF
options         NETGRAPH_CAR
options         NETGRAPH_ETHER
options         NETGRAPH_IPFW
options         NETGRAPH_IFACE
options         NETGRAPH_KSOCKET
options         NETGRAPH_PPP
options         NETGRAPH_PPTPGRE
options         NETGRAPH_PPPOE
options         NETGRAPH_SOCKET
options         NETGRAPH_TCPMSS
options         NETGRAPH_TEE
options         NETGRAPH_VJC
options         NETGRAPH_MPPC_ENCRYPTION
options         NETGRAPH_NETFLOW



CPU: Intel(R) Xeon(R) CPU           X3470  @ 2.93GHz (2933.36-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x106e5  Family = 0x6  Model = 0x1e  Stepping = 5
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory  = 4294967296 (4096 MB)
avail memory = 4052344832 (3864 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <INTEL  S3420GPC>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6


I tried to get a ktrace dump, but I could not open it:
ktrdump: kvm_nlist: No such file or directory


I think something is wrong with the netgraph subsystem.
>How-To-Repeat:
Update to FreeBSD 10.0 and connect 100-150 PPPoE users through mpd5.
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


