Date:      Sun, 9 Jan 2011 16:41:43 +0100
From:      Tom Vijlbrief <tom.vijlbrief@xs4all.nl>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Panic 8.2 PRERELEASE WRITE_DMA48
Message-ID:  <AANLkTin3FHcsdMtA9OYaA2wrUx+fpyEsTThdRmS8sXA5@mail.gmail.com>
In-Reply-To: <20110109122243.GA37530@icarus.home.lan>
References:  <AANLkTi=iaq1Lx521oUF2BSB4-2wi9Ys2fTLzz4kLaLVo@mail.gmail.com> <20110109122243.GA37530@icarus.home.lan>

I've run fsck on /usr many times in single-user mode because I had soft
updates inconsistencies; there were no DMA errors during those repairs.

smartctl 5.40 2010-10-16 r3189 [FreeBSD 8.2-PRERELEASE i386] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F1 DT series
Device Model:     SAMSUNG HD103UJ
Serial Number:    S13PJ9BQC02902
Firmware Version: 1AA01113
User Capacity:    1,000,204,886,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Sun Jan  9 16:40:24 2011 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (11811) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 198) minutes.
Conveyance self-test routine
recommended polling time:        (  21) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   078   078   011    Pre-fail  Always       -       7580
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       399
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       10097
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       2375
 10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       392
 13 Read_Soft_Error_Rate    0x000e   100   100   000    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   000    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   057   052   000    Old_age   Always       -       43 (Min/Max 42/45)
194 Temperature_Celsius     0x0022   056   050   000    Old_age   Always       -       44 (Min/Max 42/46)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       20728126
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      2361         -
# 2  Short offline       Completed without error       00%      2205         -
# 3  Short offline       Completed without error       00%      2138         -
# 4  Extended offline    Completed without error       00%      2109         -
# 5  Short offline       Completed without error       00%      2105         -
# 6  Short offline       Completed without error       00%      2092         -
# 7  Short offline       Completed without error       00%      2083         -
# 8  Short offline       Completed without error       00%      2057         -
# 9  Extended offline    Completed without error       00%      2037         -
#10  Short offline       Completed without error       00%      2033         -
#11  Short offline       Completed without error       00%      2009         -
#12  Short offline       Completed without error       00%      1974         -
#13  Short offline       Completed without error       00%      1941         -
#14  Extended offline    Completed without error       00%      1920         -
#15  Short offline       Completed without error       00%      1916         -
#16  Short offline       Completed without error       00%      1868         -
#17  Short offline       Completed without error       00%      1810         -
#18  Short offline       Completed without error       00%      1655         -
#19  Short offline       Completed without error       00%      1638         -
#20  Extended offline    Completed without error       00%      1596         -
#21  Short offline       Completed without error       00%      1591         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
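
As a rough cross-check of the attribute table in this output, here is a minimal Python sketch that pulls the raw values of the attributes usually looked at when a bad block or a cabling problem is suspected. The set of watched IDs, the function name `watched_raw_values`, and the `SAMPLE` text (four rows copied from the table, one row per line) are assumptions of this sketch and not part of smartctl itself.

```python
import re

# Attribute IDs to watch. Which ones matter is an assumption of this sketch.
WATCHED_IDS = {5, 197, 198, 199}

# One attribute row per line:
# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+"   # ID, name, flag
    r"\s+\d+\s+\d+\s+\d+"                   # value, worst, thresh
    r"\s+\S+\s+\S+\s+\S+"                   # type, updated, when_failed
    r"\s+(\d+)"                             # leading integer of the raw value
)

# Rows copied from the smartctl output in this mail (unwrapped).
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       1
"""


def watched_raw_values(text):
    """Return {attribute_name: raw_value} for the watched attribute IDs."""
    found = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) in WATCHED_IDS:
            found[match.group(2)] = int(match.group(3))
    return found


if __name__ == "__main__":
    for name, raw in watched_raw_values(SAMPLE).items():
        print(f"{name}: {raw}")
```

On the rows above this reports zero reallocated, pending and offline-uncorrectable sectors and a single UDMA CRC error. The sketch only reads the leading integer of the raw value and will not match rows that a mail client has wrapped across two lines.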


dmesg was in the attachment of the original mail but I'll paste it here:

Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-PRERELEASE #0: Thu Dec 30 12:21:06 CET 2010
    root@swanbsd.v7f.eu:/usr/obj/usr/src/sys/GENERIC i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.53GHz (2558.54-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf24  Family = f  Model = 2  Stepping = 4
  Features=0x3febfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM>
real memory  = 1610612736 (1536 MB)
avail memory = 1555316736 (1483 MB)
ACPI APIC Table: <ASUS   P4B533  >
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <ASUS P4B533> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 5ff00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> mem 0xf2000000-0xf2ffffff,0xf4000000-0xf7ffffff,0xf3800000-0xf387ffff irq 16 at device 0.0 on pci1
nvidia0: <GeForce4 Ti 4200> on vgapci0
vgapci0: child nvidia0 requested pci_enable_busmaster
vgapci0: child nvidia0 requested pci_enable_io
nvidia0: [GIANT-LOCKED]
nvidia0: [ITHREAD]
uhci0: <Intel 82801DB (ICH4) USB controller USB-A> port 0xd800-0xd81f irq 16 at device 29.0 on pci0
uhci0: [ITHREAD]
uhci0: LegSup = 0x2f00
usbus0: <Intel 82801DB (ICH4) USB controller USB-A> on uhci0
uhci1: <Intel 82801DB (ICH4) USB controller USB-B> port 0xd400-0xd41f irq 19 at device 29.1 on pci0
uhci1: [ITHREAD]
uhci1: LegSup = 0x2f00
usbus1: <Intel 82801DB (ICH4) USB controller USB-B> on uhci1
uhci2: <Intel 82801DB (ICH4) USB controller USB-C> port 0xd000-0xd01f irq 18 at device 29.2 on pci0
uhci2: [ITHREAD]
uhci2: LegSup = 0x2f00
usbus2: <Intel 82801DB (ICH4) USB controller USB-C> on uhci2
ehci0: <Intel 82801DB/L/M (ICH4) USB 2.0 controller> mem 0xf1800000-0xf18003ff irq 23 at device 29.7 on pci0
ehci0: [ITHREAD]
usbus3: EHCI version 1.0
usbus3: <Intel 82801DB/L/M (ICH4) USB 2.0 controller> on ehci0
pcib2: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci2: <ACPI PCI bus> on pcib2
rl0: <RealTek 8139 10/100BaseTX> port 0xb800-0xb8ff mem 0xf1000000-0xf10000ff irq 22 at device 10.0 on pci2
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> PHY 0 on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:50:bf:e1:3b:35
rl0: [ITHREAD]
atapci0: <SiI 3512 SATA150 controller> port 0xb400-0xb407,0xb000-0xb003,0xa800-0xa807,0xa400-0xa403,0xa000-0xa00f mem 0xf0800000-0xf08001ff irq 23 at device 11.0 on pci2
atapci0: [ITHREAD]
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
pcm0: <Creative EMU10K1> port 0x9800-0x981f irq 20 at device 12.0 on pci2
pcm0: <SigmaTel STAC9708/11 AC97 Codec>
pcm0: [ITHREAD]
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci1: <Intel ICH4 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f irq 18 at device 31.1 on pci0
ata0: <ATA channel 0> on atapci1
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci1
ata1: [ITHREAD]
atrtc0: <AT realtime clock> port 0x70-0x73 irq 8 on acpi0
fdc0: <floppy drive controller> port 0x3f2-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> port 0x378-0x37f,0x778-0x77b irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppc0: [ITHREAD]
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
plip0: [ITHREAD]
lpt0: <Printer> on ppbus0
lpt0: [ITHREAD]
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xd0000-0xd47ff pnpid ORM0000 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
p4tcc0: <CPU Frequency Thermal Control> on cpu0
Timecounter "TSC" frequency 2558535720 Hz quality 800
Timecounters tick every 1.000 msec
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <Intel> at usbus3
uhub3: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
ad0: 76319MB <WDC WD800BB-00CAA1 17.07W17> at ata0-master UDMA100
acd0: DVDROM <HL-DT-STDVD-ROM GDR8161B/0100> at ata0-slave UDMA33
ata1: DMA limited to UDMA33, controller found non-ATA66 cable
acd1: DVDR <HL-DT-STDVD-RAM GH22NP20/1.02> at ata1-master UDMA33
ad4: 953869MB <SAMSUNG HD103UJ 1AA01113> at ata2-master UDMA100 SATA 1.5Gb/s
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
Root mount waiting for: usbus3
Root mount waiting for: usbus3
uhub3: 6 ports with 6 removable, self powered


2011/1/9 Jeremy Chadwick <freebsd@jdc.parodius.com>:
> On Sun, Jan 09, 2011 at 12:33:10PM +0100, Tom Vijlbrief wrote:
>> The last half year I've been installing FreeBSD on several machines.
>>
>> I installed it on my main desktop system a few weeks ago which
>> normally runs Linux, but I get this panic under heavy disk I/O.
>>
>> It even happened during the initial sysinstall, although I also have
>> completed several buildworlds without problems.
>>
>> I can trigger it easily by accessing /usr (UFS) and a linux ext
>> partition simultaneously, eg by copying
>> large files to the /usr partition.
>>
>> Just bought a serial cable to enable the serial console of the various
>> FreeBSD installations, which is of good use for this problem, because
>> a crash dump is not written.
>>
>> Full boot output in the attachment
>>
>> Sun Jan  9 10:11:17 CET 2011
>> unknown: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=274799820
>> ata2: timeout waiting to issue command
>> ata2: error issuing WRITE_DMA48 command
>> g_vfs_done():ad4s2f[WRITE(offset=28915105792, length=131072)]error = 6
>> /usr: got error 6 while accessing filesystem
>> panic: softdep_deallocate_dependencies: unrecovered I/O error
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0xc08e0f77 at kdb_backtrace+0x47
>> #1 0xc08b2037 at panic+0x117
>> #2 0xc0ae2ecd at softdep_deallocate_dependencies+0x3d
>> #3 0xc0925590 at brelse+0x90
>> #4 0xc092829a at bufdone_finish+0x3fa
>> #5 0xc092830d at bufdone+0x4d
>> #6 0xc092bdf9 at cluster_callback+0x89
>> #7 0xc09282f7 at bufdone+0x37
>> #8 0xc0850ad5 at g_vfs_done+0x85
>> #9 0xc09224d9 at biodone+0xb9
>> #10 0xc084da69 at g_io_schedule_up+0x79
>> #11 0xc084e0a8 at g_up_procbody+0x68
>> #12 0xc0886fc1 at fork_exit+0x91
>> #13 0xc0bcc144 at fork_trampoline+0x8
>> Uptime: 2h56m27s
>> Physical memory: 1515 MB
>> Dumping 177 MB:ata2: timeout waiting to issue command
>> ata2: error issuing WRITE_DMA command
>>
>> ** DUMP FAILED (ERROR 5) **
>> Automatic reboot in 15 seconds - press a key on the console to abort
>> Rebooting...
>
> Can you please provide output from the following commands (after
> installing ports/sysutils/smartmontools, which should be version 5.40 or
> later (in case you haven't updated your ports tree)):
>
> $ dmesg
> $ smartctl -a /dev/ad4
>
> The SMART output should act as a verifier as to whether or not you
> really do have a bad block on your disk (which is what READ/WRITE_DMA48
> can sometimes indicate).
>
> You may also want to boot the machine in single user mode and do a
> manual "fsck /dev/ad4s2f".  It's been proven in the past that
> background_fsck doesn't manage to address all issues.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>
>


