Date:      Tue, 24 Apr 2001 15:23:54 -0400 (EDT)
From:      Jaime Kikpole <jkikpole@cairodurham.org>
To:        <freebsd-stable@freebsd.org>
Subject:   Random (?) kernel panics
Message-ID:  <20010424152123.H13315-100000@zeus>

	I sent the following email to freebsd-questions.  Someone there
suggested that this list was the place for stuff like this.  Any advice is
appreciated.

							Thanks in advance,
							Jaime Kikpole

---------- Forwarded message ----------
Date: Tue, 24 Apr 2001 09:33:59 -0400 (EDT)
From: Jaime Kikpole <jkikpole@cairodurham.org>
To: freebsd-questions@freebsd.org
Subject: Kernel panics

	For the third consecutive time, my 4.2-Stable system has panicked
while it was doing massive disk-to-disk copying.  The copying is done by
the following script, /etc/periodic/weekly/920.backups.

#!/bin/sh
#
echo "Backing up files:"
/usr/bin/uptime
echo " moving last week's tarballs "
/bin/mv /backups/1/var.tar.gz /backups/2/var.tar.gz
/bin/mv /backups/1/web.tar.gz /backups/2/web.tar.gz
echo " making this week's tarballs "
/usr/bin/tar czpf /backups/1/var.tar.gz -C /var .
/usr/bin/tar czpf /backups/1/web.tar.gz -C /web .

echo " moving last week's /home "
/bin/rm -R /backups/2/home
/usr/bin/tar -cpf - -C /backups/1 ./home | /usr/bin/tar -xpf - -C /backups/2
/bin/chmod -RP ugo-w /backups/2
echo " backing up /home "
/bin/rm -R /backups/1/home
/usr/bin/tar -cpf - -C / ./home | /usr/bin/tar -xpf - -C /backups/1
/bin/chmod -RP ugo-w /backups/1
/usr/bin/uptime

	The panics seem to happen during the tar cpf | tar xpf pipeline
near the end.  The following is the most recent kernel panic data:

zeus# cd /var/crash
zeus# ls
bounds          kernel.2        minfree         vmcore.2
zeus# gdb -k kernel.2 vmcore.2
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
IdlePTD 3383296
initial pcb at 2af4e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x0
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc0204a8c
stack pointer           = 0x10:0xcb889f00
frame pointer           = 0x10:0xcb889f08
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2 (pagedaemon)
interrupt mask          = net bio cam
trap number             = 12
panic: page fault

syncing disks... 492 492 492 492 492 492 492 492 492 492 492 492 492 492
492 492 492 492 492 492
giving up on 492 buffers
Uptime: 5d16h17m2s

dumping to dev #ad/0x20001, offset 524288
dump ata0: resetting devices .. done
256 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239
238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221
220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203
202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185
184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167
166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149
148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131
130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113
112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93
92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68
67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43
42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18
17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  0xc0152ad2 in dumpsys ()


	The server has a vinum RAID-5 array (four 18GB SCSI disks) for /home
and two 60GB EIDE hard drives for /backups/1 and /backups/2.  This
procedure has worked for a few months (3 to 6, I can't recall) with
almost no problems.  The system panicked twice, IIRC, while it was
running 4.1-Release or 4.1.1-Stable.  After I used cvsup and make world
to update it to 4.2-Stable, it was fine until last weekend.

	Since writing the above (it was never received by the mailing list
due to DNS configuration errors on my SMTP relay), I have run the BIOS RAM
test and gotten no warnings.  I also pkg_delete-ed everything listed in
/var/db/pkg, cvsup-ed, did a make world, ran mergemaster, and then
rebuilt (make install) all of the ports that I need.  I then ran the
above script (920.backups) twice as a stress test.  It seemed fine, so I
declared the server good and put it back into service.  On the morning of
the second full day back in service, I got another kernel panic.  This
time, the gdb output looks like this:

zeus:crash>ls
bounds          kernel.3        vmcore.2
kernel.2        minfree         vmcore.3
zeus:crash>pwd
/var/crash
zeus:crash>gdb -k kernel.3 vmcore.3
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
IdlePTD 3420160
initial pcb at 2b84e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x20200
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc017cc8d
stack pointer           = 0x10:0xcd439cc4
frame pointer           = 0x10:0xcd439cd4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 26781 (afpd)
interrupt mask          = bio
trap number             = 12
panic: page fault

syncing disks...

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x2041838
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc017d197
stack pointer           = 0x10:0xcd439a54
frame pointer           = 0x10:0xcd439a74
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 26781 (afpd)
interrupt mask          = bio
trap number             = 12
panic: page fault

syncing disks...

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x2041838
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc017d197
stack pointer           = 0x10:0xcd439a54
frame pointer           = 0x10:0xcd439a74
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 26781 (afpd)
interrupt mask          = bio
trap number             = 12
panic: page fault
Uptime: 1d0h34m2s

dumping to dev #ad/0x20001, offset 524288
dump ata0: resetting devices .. done
256 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239
238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221
220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203
202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185
184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167
166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149
148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131
130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113
112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93
92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68
67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43
42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18
17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  0xc0157bb6 in dumpsys ()
(kgdb)


	Any advice, suggestions, background info, analysis, etc. are
welcome.  I would be happy to answer questions about this setup.  If I
can contribute data (like the kernel.? and vmcore.? files) to help anyone
fix bugs, I'd be happy to do so.  Remember, the afpd file server was
sharing out a vinum RAID-5 array, so there may be some helpful
information in those kernel dump files.
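
	Since gdb reports no debugging symbols, one way to narrow things
down might be to match the instruction pointers above against the symbol
table of the saved kernel.  The following is only a rough sketch: the
function name nearest_symbol is made up for illustration, and it assumes
the kernel.? files still carry a symbol table that nm can read and that
addresses are printed as 8 lowercase hex digits.

```sh
#!/bin/sh
# nearest_symbol ADDR  (hypothetical helper, not part of FreeBSD)
# Reads `nm -n` style lines ("address type name", sorted by address) on
# stdin and prints the last symbol whose address is <= ADDR.
# ADDR is 8 lowercase hex digits without the 0x prefix, e.g. c0204a8c.
nearest_symbol() {
    awk -v addr="$1" '
        # Fixed-width lowercase hex compares correctly as plain strings;
        # the "" concatenation forces a string comparison.
        # Lines without an 8-digit address (undefined symbols) are skipped.
        length($1) == 8 && ($1 "") <= (addr "") { best = $3 }
        END { if (best != "") print best }
    '
}

# Assumed usage against the kernel saved alongside the dump:
#   nm -n /var/crash/kernel.2 | nearest_symbol c0204a8c
#   nm -n /var/crash/kernel.3 | nearest_symbol c017cc8d
```

The name printed would be the function containing the faulting
instruction, provided the kernel image matches the dump.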

						Many thanks in advance,
						Jaime

-- 
Network Administrator
Cairo-Durham Central School District





To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message