From: Doug Hardie
To: stable@freebsd.org
Date: Fri, 3 Mar 2006 17:23:33 -0800
Subject: Failed disk sectors
List-Id: Production branch of FreeBSD source code

I have a large disk that has several failed sectors. The drive is the article storage for news, so it holds a lot of files. During the inn expire operation I get error messages showing a couple of failed sectors that the drive cannot read. The LBA is given. The problem is finding out what those LBAs are used for.
The drive's SMART status shows plenty of available spare sectors, but since it can't read those sectors it won't remap them to a spare until the next write to that sector. expire gives up when it reaches that error.

My first attempt was to run cksum over all the files on the disk. That caught one of the sectors and gave me the file name. I deleted the file and, since it was an overview file for one group, just rebuilt it. There are still more to go, though, and that process took many hours.

I have not found anything in the archives, man pages, or ports that addresses identifying the object/file that owns a given LBA, so I have started looking into the UFS structures to see how that could be done.

The fdisk source shows how to access the partition data. For this disk, fdisk reports a media sector size of 512 and the block count matches that. So I assume I would have to subtract the start of that partition from the LBA. However, that assumes the LBA uses the same 512-byte block numbering. I am not convinced that would always be correct.

Next I have to address the bsdlabel. I am now presuming that an LBA of 0 is the start of the drive, not the start of the partition. I am not sure this is correct either. If so, then bsdlabel-type code would be required to identify the partition, and the start of that partition would need to be subtracted from the LBA. At that point I think I have the values that would be found in the block tables in the inodes.

Before digging into the inode structures I thought it would be a good idea to check my understanding to this point. Am I on the right path?
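To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes the reported LBA counts 512-byte sectors from the start of the drive, and that the bsdlabel offset is relative to the start of the fdisk partition. Both are exactly the points I am unsure of, so treat them as assumptions. The function name, its parameters, and the 2048-byte fragment size are made up for illustration. The last step also assumes that UFS inode block pointers are in fragment units rather than sectors, which would need to be checked against the superblock of the actual filesystem.

```python
def lba_to_fs_fragment(lba, slice_start, part_offset,
                       sector_size=512, frag_size=2048):
    """Translate a whole-disk LBA into filesystem-relative units.

    lba          -- failing sector as reported, counted from the start of the drive (assumed)
    slice_start  -- first sector of the fdisk partition, from the start of the drive
    part_offset  -- bsdlabel offset of the partition, relative to the slice (assumed)
    sector_size  -- bytes per sector, 512 as fdisk reports for this disk
    frag_size    -- filesystem fragment size in bytes (illustrative value only)

    Returns (sector within the filesystem, fragment number within the filesystem).
    """
    part_start = slice_start + part_offset
    if lba < part_start:
        raise ValueError("LBA lies before the start of the partition")

    # Sector number counted from the first sector of the filesystem.
    fs_sector = lba - part_start

    # Fragments are a whole number of sectors; integer division gives the
    # fragment that contains the failing sector.
    sectors_per_frag = frag_size // sector_size
    return fs_sector, fs_sector // sectors_per_frag


# Hypothetical numbers, not taken from the failing drive.
print(lba_to_fs_fragment(1000063, slice_start=63, part_offset=0))
```

If those assumptions hold, the second value is what I would expect to search for in the inode block tables. If the label offsets turn out to be absolute, part_offset already includes slice_start and the slice start should not be added again.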