Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 18 Jul 2006 13:17:32 +0100
From:      Gareth McCaughan <gmccaughan@synaptics-uk.com>
To:        freebsd-hackers@freebsd.org
Subject:   "swiN: clock sio" process taking 75% CPU
Message-ID:  <200607181317.33416.gmccaughan@synaptics-uk.com>

next in thread | raw e-mail | index | archive | help
(I've asked this question on -questions and -stable, with no success;
hence I'm taking it to the assembled wizardry of -hackers. A bit of
googling suggests that I'm far from the first person to have had a
similar problem, though it seems to be worse for me than for the
others I've found.)

I have a box running 6-STABLE, cvsupped last week.
Until recently it was running 5.something and showed
the same peculiar behaviour as I'm about to describe.
Further back, it used to run 4.x, and I don't recall
anything like this happening then.

About 6 minutes after booting (on three occasions, but I
don't guarantee this doesn't vary), a process (well, a
kernel interrupt thread, I guess) that appears in the
output of "ps" as "[swi4: clock sio]" begins to use
about 3/4 of the machine's CPU. I think it does so
more or less instantaneously. It continues to do so
indefinitely, so far as I can tell.

I'm not aware of anything specific that triggers this,
though I suppose there must *be* something. It happens
apparently spontaneously, on a lightly loaded machine.

Those cycles are genuinely being consumed; other processes
run much more slowly than they "should", and take much more
wall time than CPU time.

I've tried diddling my kernel's HZ value; the behaviour
with HZ=100 and with HZ=1000 is the same, so far as I'm
able to tell. I've no idea whether it might be relevant,
but I have option DEVICE_POLLING turned on; toggling
sysctl kern.polling.enable doesn't seem to make any
difference.

The machine is a very uninteresting single-CPU Athlon box,
clocked at 1.6GHz, several years old. Here's its dmesg output,
with a few uninteresting bits of information leakage elided.

---------- dmesg output begins ----------
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 6.1-STABLE #3: Thu Jul 13 16:46:07 BST 2006
    root@xxxxxxx.xxxxxxxx.com:/usr/obj/usr/src/sys/xxxxxxxx
ACPI APIC Table: <AMIINT VIA_K7  >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 1900+ (1599.73-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x662  Stepping = 2
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0480800<SYSCALL,MP,MMX+,3DNow+,3DNow>
real memory  = 536805376 (511 MB)
avail memory = 515944448 (492 MB)
ioapic0 <Version 1.1> irqs 0-23 on motherboard
netsmb_dev: loaded
acpi0: <AMIINT VIA_K7> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_throttle0: <ACPI CPU Throttling> on cpu0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA 8367 (KT266/KY266x/KT333) host to PCI bridge> mem 0xe0000000-0xe3ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xdfffff80-0xdfffffff irq 17 at device 6.0 on pci0
miibus0: <MII bus> on xl0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:04:75:7e:54:fd
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8233 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 17.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
uhci0: <VIA 83C572 USB controller> port 0xdc00-0xdc1f irq 5 at device 17.2 on pci0
uhci0: [GIANT-LOCKED]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xe000-0xe01f irq 5 at device 17.3 on pci0
uhci1: [GIANT-LOCKED]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xe400-0xe41f irq 5 at device 17.4 on pci0
uhci2: [GIANT-LOCKED]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: <multimedia, audio> at device 17.5 (no driver attached)
acpi_button1: <Sleep Button> on acpi0
fdc0: <floppy drive controller> port 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77b irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 1599730245 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 38166MB <Seagate ST340014A 3.06> at ata0-master UDMA100
acd0: DVDROM <SAMSUNG DVD-ROM SD-616F/F101> at ata1-master UDMA33
Trying to mount root from ufs:/dev/ad0s1a
---------- dmesg output ends ----------

vmstat -i produces pretty normal-looking numbers (this is while
running with HZ=100):

  | interrupt                          total       rate
  | irq1: atkbd0                           3          0
  | irq6: fdc0                            10          0
  | irq14: ata0                         2913          1
  | irq15: ata1                           47          0
  | irq17: xl0                          7342          4
  | cpu0: timer                       302649        199
  | Total                             312964        206

A little more information, in case it's useful:

  | $ echo; sysctl debug | egrep to_
  | debug.to_avg_mpcalls: 2890
  | debug.to_avg_mtxcalls: 0
  | debug.to_avg_gcalls: 768
  | debug.to_avg_depth: 3815

That's with HZ = 100. Here are some numbers from a message
in freebsd-ia64, from Marcel Moolenaar, in 2004-07, to someone
seeing symptoms like mine. They're meant to be typical healthy
numbers. Mine above look somewhat worse, but not insanely so;
surely not enough to explain the difference between using
0.3% cpu and using 75%. Marcel also had HZ=100.

  | % sysctl debug | grep to_avg
  | debug.to_avg_depth: 2500
  | debug.to_avg_gcalls: 1003
  | debug.to_avg_mpcalls: 1255

I would be grateful for any insight into what's going wrong and how
(if at all) it can be fixed or worked around. I'm not subscribed to
-hackers, so would prefer to be cc'ed, but I'll check the web archives
and should therefore see any responses that go only to the list.

I can diddle fairly freely with the machine if that proves useful,
and don't mind tweaking internals if it's diagnostically helpful,
but I'm not a FreeBSD kernel expert and may therefore need some
help identifying what to tweak and how to interpret the results.

Thanks in advance!

-- 
Gareth McCaughan




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200607181317.33416.gmccaughan>