Date: Thu, 30 Oct 2008 17:57:45 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Brendan Hart <brendanh@strategicecommerce.com.au>
Cc: freebsd-questions@freebsd.org
Subject: Re: Large discrepancy in reported disk usage on USR partition
Message-ID: <20081031005745.GA18319@icarus.home.lan>
In-Reply-To: <038b01c93af1$f0b84fb0$d228ef10$@com.au>
References: <021f01c93a28$651752e0$2f45f8a0$@com.au>
    <20081030011949.GA91409@icarus.home.lan>
    <022601c93a30$b283e7c0$178bb740$@com.au>
    <20081030015517.GA92091@icarus.home.lan>
    <022a01c93a40$6f16e860$4d44b920$@com.au>
    <20081030040754.GA94642@icarus.home.lan>
    <038b01c93af1$f0b84fb0$d228ef10$@com.au>
On Fri, Oct 31, 2008 at 11:15:15AM +1030, Brendan Hart wrote:
> > What you showed tells me nothing about SMART, other than the remote possibility
> > its basing some of its decisions on the "general SMART health status",
> > which means jack squat.  I can explain why this is if need be, but it's
> > not related to the problem you're having.
>
> Thanks for this additional information. I hadn't understood that there was
> far more information behind the simple SMART ok/not ok reported by the PERC
> controller.

Here's an example of some attributes:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   178   175   021    Pre-fail  Always       -       6066
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       50
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11429
 10 Spin_Retry_Count        0x0012   100   253   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       48
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       33
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       50
194 Temperature_Celsius     0x0022   117   100   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   051    Old_age   Offline      -       0

You probably now understand why having access to this information is
useful.  :-)  It's very disappointing that so many RAID controllers
don't provide a way to get at this information; the ones which do I am
very thankful for!

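To illustrate why the per-attribute detail matters more than a single
ok/not-ok flag, here is a minimal Python sketch that reads a table laid
out like the one above and reports anything worth a closer look.  It
rests on several assumptions that are not from the original message:
the column layout is taken to match the table shown, the names
parse_attributes, find_warnings and WATCH_RAW are hypothetical, and the
choice of which raw counters to watch is an illustrative guess rather
than a vendor rule.

```python
import re

# One attribute row; assumes the column order of the table quoted above.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>\d+)"
)

# Hypothetical watch list: counters where a non-zero raw value is
# usually worth investigating.  This is an assumption for illustration.
WATCH_RAW = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def parse_attributes(text):
    """Return one dict per attribute row; non-matching lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header line, blank line, or unexpected format
        row = match.groupdict()
        for key in ("id", "value", "worst", "thresh", "raw"):
            row[key] = int(row[key])  # "051" becomes 51
        rows.append(row)
    return rows


def find_warnings(rows):
    """Flag attributes at or below threshold, or with suspicious raw counts."""
    found = []
    for row in rows:
        if row["thresh"] > 0 and row["value"] <= row["thresh"]:
            found.append("%s: value %d at or below threshold %d"
                         % (row["name"], row["value"], row["thresh"]))
        elif row["name"] in WATCH_RAW and row["raw"] > 0:
            found.append("%s: raw count is %d" % (row["name"], row["raw"]))
    return found


# A few rows copied from the table above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11429
194 Temperature_Celsius     0x0022   117   100   000    Old_age   Always       -       33
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
"""
```

Calling find_warnings(parse_attributes(SAMPLE)) on the rows above
returns an empty list, since no value is at its threshold and the
watched raw counters are all zero.  A disk can report a healthy overall
status while one of those raw counters is already climbing, which is
the kind of detail a bare ok/not-ok flag hides.
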
> > Either way, this is just one of many reasons to avoid hardware RAID
> > controllers if given the choice.
>
> I have seen some mentions of using gvinum and/or gmirror to achieve the
> goals of protection from Single Point Of Failure with a single disk, which I
> believe is the reason that most people, myself included, have specified
> Hardware RAID in their servers. Is this what you mean by avoiding Hardware
> Raid?

More or less.  Hardware RAID has some advantages (I can dig up a mail of
mine from long ago outlining what the advantages were), but a lot of the
time the controller acts as more of a hindrance than a benefit.  I
personally feel the negatives outweigh the positives, but each person
has different needs and requirements.

There are some controllers which work very well and provide a great
degree of insight (at a disk level) under FreeBSD, and those are often
what I recommend if someone wants to go that route.

I make it sound like I'm the authoritative voice for what a person
should or should not buy -- I'm not.  I predominantly rely on Intel ICHx
on-board controllers with SATA disks, because ICHx works quite well
under FreeBSD (especially with AHCI).

I personally have no experience with gmirror or gvinum, but I do have
experience with ZFS.  (I'll have a little more experience with gmirror
once I have the time to test some reported problems with gmirror and
high interrupt counts when a disk is hot-swapped.)

> > I hope these are SCSI disks you're showing here, otherwise I'm not sure how the
> > controller is able to get the primary defect count of a SATA or SAS disk.  So,
> > assuming the numbers shown are accurate, then yes, I don't think there's any
> > disk-level problem.
>
> Yes, they are SCSI disks. Not particularly relevant to this topic, but
> interesting: I would have thought that SAS would make the same information
> available as SCSI does, as it is a serial bus evolution of SCSI. Is this
> thinking incorrect?

I don't have any experience with SAS, so I can't comment on what
features are available on SAS.

Specifically with regards to SMART: historically, SCSI does not provide
the amount of granularity/detail with attributes as ATA/SATA does.  I do
not consider this a negative against SCSI (in fact, I very much like
SCSI).

SAS might provide these details, but I don't know, as I don't have any
SAS disks.

-- 
| Jeremy Chadwick                                  jdc at parodius.com |
| Parodius Networking                         http://www.parodius.com/ |
| UNIX Systems Administrator                    Mountain View, CA, USA |
| Making life hard for others since 1977.                PGP: 4BD6C0CB |