Date:      Tue, 15 Mar 2005 08:22:51 -0500
From:      Michael Boers <msb@datacompusa.com>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-geom@FreeBSD.org
Subject:   Re: gmirrored boot drives locks up during buildworld
Message-ID:  <42cc591ca3faf6a5ab49fa7101e5945c@datacompusa.com>
In-Reply-To: <20050314190650.GX9291@darkness.comp.waw.pl>
References:  <f7944513206a01f926630fed59abc188@datacompusa.com> <20050314190650.GX9291@darkness.comp.waw.pl>

Pawel,

Thank you for your quick response.

In addition to enabling HTT in the BIOS, I have made the following 
changes to the kernel configuration:

options	SMP
options	MPTABLE_FORCE_HTT
options	MP_WATCHDOG

and set the following sysctl variables:

kern.geom.mirror.debug=1
debug.watchdog=1

I reran the "while (true) do make clean; make buildworld; done" test 
and the machine locked up as before.

I was monitoring the test with one terminal running top:

last pid: 57530;  load averages:  3.79,  3.24,  2.93    up 0+07:18:14  23:31:10
148 processes: 6 running, 111 sleeping, 31 waiting
CPU states: 31.3% user,  0.0% nice, 18.5% system,  0.2% interrupt, 50.0% idle
Mem: 110M Active, 385M Inact, 89M Wired, 672K Cache, 112M Buf, 416M Free
Swap: 512M Total, 512M Free

   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
    11 root     130    0     0K    12K CPU1   1 383:33 100.54% 100.54% mp_watchdog cpu 1
     3 root      -8    0     0K    12K -      0  21:40 29.74% 29.74% g_up
    29 root     -64 -183     0K    12K WAIT   0   2:42  5.22%  5.22% irq18: uhci2+
     4 root      -8    0     0K    12K -      0   3:27  2.05%  2.05% g_down
    12 root      96    0     0K    12K RUN    0 383:56  0.00%  0.00% idle: cpu0
   576 mysql     20    0 56896K 34584K kserel 0   1:17  0.00%  0.00% mysqld
   713 root       8    0  2000K  1456K nanslp 0   0:19  0.00%  0.00% gstat
    25 root     -64 -183     0K    12K WAIT   0   0:19  0.00%  0.00% irq14: ata0
    50 root      -8    0     0K    12K m:w1   0   0:18  0.00%  0.00% g_mirror boot
    37 root     -28 -147     0K    12K WAIT   0   0:15  0.00%  0.00% swi5: clock sio
    53 root     171   52     0K    12K RUN    0   0:14  0.00%  0.00% pagezero
   680 root      96    0  2440K  1728K CPU0   0   0:13  0.00%  0.00% top
   579 mysql     20    0   132M 43196K kserel 0   0:09  0.00%  0.00% mysqld
   577 mysql     20    0 56576K 24688K kserel 0   0:08  0.00%  0.00% mysqld
    39 root      76    0     0K    12K -      0   0:06  0.00%  0.00% yarrow
   594 msb       96    0  6216K  3020K select 0   0:05  0.00%  0.00% sshd
    56 root      20    0     0K    12K syncer 0   0:03  0.00%  0.00% syncer


and another terminal running gstat:

dT: 0.510  flag_I 500000us  sizeof 240  i -1
  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
     0      0      0      0    0.0      0      0    0.0    0.0| ad0
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/boot
     0      0      0      0    0.0      0      0    0.0    0.0| ad0a
     0      0      0      0    0.0      0      0    0.0    0.0| ad0b
     0      0      0      0    0.0      0      0    0.0    0.0| ad0c
     0      0      0      0    0.0      0      0    0.0    0.0| ad0d
     0      0      0      0    0.0      0      0    0.0    0.0| ad0e
     0      0      0      0    0.0      0      0    0.0    0.0| ad0f
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/boota
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/bootb
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/bootc
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/bootd
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/boote
     0      0      0      0    0.0      0      0    0.0    0.0| mirror/bootf
     0      0      0      0    0.0      0      0    0.0    0.0| ad1
     0      0      0      0    0.0      0      0    0.0    0.0| acd0
     0      0      0      0    0.0      0      0    0.0    0.0| ad4
     0      0      0      0    0.0      0      0    0.0    0.0| ad6
     0      0      0      0    0.0      0      0    0.0    0.0| ad4c
     0      0      0      0    0.0      0      0    0.0    0.0| ad4d
     0      0      0      0    0.0      0      0    0.0    0.0| ad6c
     0      0      0      0    0.0      0      0    0.0    0.0| ad6d


I have not touched the machine yet.  Is there any other info I can 
provide?

--
Michael Boers
Datacomp

On Mar 14, 2005, at 2:06 PM, Pawel Jakub Dawidek wrote:

> On Mon, Mar 14, 2005 at 01:46:15PM -0500, Michael Boers wrote:
> +> I recently installed FreeBSD 5.3 on a machine to be my primary mysql
> +> server.  The machine failed after about 3 weeks of heavy use.  The
> +> machine did not panic, it just froze and some random characters
> +> appeared on the console.  A reboot restored the system for another few
> +> weeks.  On the third failure I took it out of production.
> +>
> +> The machine consists of an Intel Pentium 4 EE HT with a pair of 80
> +> gigabyte IDE gmirrored boot drives and a pair of 250 gigabyte IDE
> +> gmirror data drives.
> +>
> +> With the machine out of production, I used
> +>
> +> while (true) do make clean; make buildworld; done
> +>
> +> to exercise the machine until it failed.  Usually within three days.  I
> +> swapped video cards, memory, hard drives, and played with bios settings
> +> to no avail.  Finally I determined that when I ran without using
> +> gmirror, the machine would build indefinitely.
> +>
> +> Finally, I tried the buildworld test on a completely different (amd vs
> +> intel, scsi vs ide disks) machine and it failed in less than 3 hours.
> +>
> +> Because the system freezes rather than panics, I have no diagnostic
> +> information to provide.
> +>
> +> If this is a possible gmirror bug, please let me know if there is any
> +> other information I can provide.  I am very interested in using gmirror
> +> but I want to make sure it is safe.  Please feel free to call me at the
> +> below number if necessary.
>
> Could you increase kern.geom.mirror.debug to 1?
> Could you turn on HTT and compile your kernel with MP_WATCHDOG (you should
> also set debug.watchdog to 1)?
>
> -- 
> Pawel Jakub Dawidek                       http://www.wheel.pl
> pjd@FreeBSD.org                           http://www.FreeBSD.org
> FreeBSD committer                         Am I Evil? Yes, I Am!


