From: Mikolaj Golub
To: freebsd-hackers@freebsd.org
Date: Mon, 24 Aug 2009 10:45:58 +0300
Subject: Partial kvm dumps
Organization: TOA Ukraine
Message-ID: <86ws4tejt5.fsf@kopusha.onet>

Hi,

I would like to discuss the idea of partial kvm dumps: the ability to dump
selected parts of kernel memory from a live system, so that they can later be
read via the KVM interface.

Why this could be useful. I suppose many people here have set up scripts that
run utilities like ps, vmstat, top etc. periodically to collect system
statistics and analyse system behaviour when problems happen. I did this so
often that I eventually wrote a Perl script, a wrapper around these utilities,
to make the setup and later data analysis easier.

http://code.google.com/p/gatherit/wiki/README

Currently I run this script on most of my servers, collecting various
statistics about the system. But I feel some discomfort from the fact that
this is rather inefficient.
Here is a typical list of commands I use to collect data:

$ ./gather show utils
--------------------------------------------------------------------------------
name        cmd                               desc
--------------------------------------------------------------------------------
devstat     /usr/local/bin/devstat            devstat output
df          /bin/df                           df output
fstat       /usr/bin/fstat                    fstat output
netstat-La  /usr/bin/netstat -nLa             netstat listening socket statistics
netstat-a   /usr/bin/netstat -na              netstat socket statistics
netstat-i   /usr/bin/netstat -ni              netstat interface statistics
netstat-m   /usr/bin/netstat -m               netstat mbuf statistics
netstat-r   /usr/bin/netstat -nr              netstat routing tables
netstat-rs  /usr/bin/netstat -rs              netstat routing statistics
netstat-s   /usr/bin/netstat -s               netstat system wide statistics
nfsstat     /usr/bin/nfsstat                  nfsstat output
ps          /bin/ps auxww                     processes statistics (-u flag)
ps-l        /bin/ps alxww                     processes statistics (-l flag)
sockstat    /usr/bin/sockstat                 sockstat output
sysctl      /sbin/sysctl -a                   sysctl variables
top         /usr/bin/top -d1 -S -b 1000       top output (cpu mode)
top-mio     /usr/bin/top -d1 -S -mio -b 1000  top output (io mode)
uptime      /usr/bin/uptime                   system uptime
vmstat      /usr/bin/vmstat                   vmstat output
vmstat-i    /usr/bin/vmstat -ai               vmstat interupts statistics

Note that many utilities are run several times with different parameters, and
some commands do almost the same thing (e.g. netstat -a and sockstat),
processing the same kernel structures. I want them all to run because I don't
know in advance which output will turn out to be more useful in a given
situation.

It would be more efficient to have a single utility that traverses the kernel
structures and extracts all the necessary data, which would be converted to
human-readable output later, when needed.

And actually we have almost everything needed for this to work. Many of the
system utilities can output data not only from a live system but from core
dumps too.
So if we created dumps from live systems periodically, we could later
use them to extract system statistics. Of course there is little sense in
dumping the whole kernel memory. We could extend our KVM interface to allow
creating, and later reading, dumps that contain only the necessary parts of
kernel memory.

As a proof of concept I have written the pkvmdump utility, which creates
partial dumps with some kernel statistics that can later be extracted by the
vmstat and ps utilities.

The details of the current implementation:

The generated dump has a simple format: a dump header (struct minidumphdr is
used with PKVMDUMP_MAGIC) followed by data entries. Each data entry consists
of a header (the address of the extracted data in kvm and its length) and the
data itself. So generating a dump is very simple: kvm_open(3) /dev/mem, read
the necessary regions of memory and write them to the dump, prepending each
with an [addr, len] header.

To read the dump, the libkvm interface has been extended. The following trick
(hack? :-) is used:

On kvm_open():

1) create a temporary (unlinked) file;

2) for every data entry from the dump, do in the tempfile:
   lseek(addr, SEEK_SET), write(data, len);

3) close the dump file and set kd->pmfd to point to the tempfile.

On kvm_read() the request is translated to a direct read from the tempfile.

This format/algorithm has been chosen because it is simple to implement, just
to start experimenting with this.

You can find the source here:

http://code.google.com/p/trociny/downloads/list

I would like to hear what other people think about this. It looks very useful
to me. At least as a first step it would be nice to extend KVM to work with
partial dumps, so that users could try this and see if it turns out to be
useful.

P.S. The final goal I would like to achieve is to make snapshots of system
state, which could be used for later analysis if necessary. Maybe the approach
I am trying here is wrong. E.g.
SNMP looks like a more proper alternative: it is a standard, and snmpd is
actually the program that "traverses kernel structures extracting all
necessary data". But SNMP has its own limitations: the statistics provided via
SNMP are rather limited, and currently I don't see how I could use it
effectively to achieve my goal, although I haven't thought much in this
direction yet...

--
Mikolaj Golub
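
For illustration, here is a minimal sketch in C of the entry format and the
tempfile trick described in the implementation details. It is not the pkvmdump
or libkvm source: the names entry_hdr, dump_write_entry, dump_to_tempfile and
dump_read are hypothetical, the struct minidumphdr dump header is left out, and
ordinary file descriptors stand in for /dev/mem and kd->pmfd. It also assumes
that every address fits in a non-negative off_t, which would not hold for high
64-bit kernel addresses.

```c
#define _XOPEN_SOURCE 700	/* for mkstemp() and pread() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-entry header: address of the data and its length. */
struct entry_hdr {
	uint64_t addr;
	uint64_t len;
};

/* Append one [addr, len] + data entry to the dump; 0 on success. */
static int
dump_write_entry(int fd, uint64_t addr, const void *data, uint64_t len)
{
	struct entry_hdr h = { addr, len };

	if (write(fd, &h, sizeof(h)) != (ssize_t)sizeof(h))
		return (-1);
	if (write(fd, data, (size_t)len) != (ssize_t)len)
		return (-1);
	return (0);
}

/*
 * Replay every entry of the dump into an unlinked temporary file,
 * storing each entry's data at file offset == addr.
 * Returns the tempfile descriptor, or -1 on error.
 */
static int
dump_to_tempfile(int dumpfd)
{
	char path[] = "/tmp/pkvm.XXXXXX";
	struct entry_hdr h;
	char buf[4096];
	ssize_t n;
	int tmpfd;

	if ((tmpfd = mkstemp(path)) == -1)
		return (-1);
	unlink(path);		/* file lives on only through tmpfd */
	if (lseek(dumpfd, 0, SEEK_SET) == -1)
		goto fail;
	while ((n = read(dumpfd, &h, sizeof(h))) == (ssize_t)sizeof(h)) {
		if (lseek(tmpfd, (off_t)h.addr, SEEK_SET) == -1)
			goto fail;
		while (h.len > 0) {	/* copy the data in chunks */
			size_t want = h.len < sizeof(buf) ?
			    (size_t)h.len : sizeof(buf);

			n = read(dumpfd, buf, want);
			if (n <= 0 || write(tmpfd, buf, (size_t)n) != n)
				goto fail;
			h.len -= (uint64_t)n;
		}
	}
	if (n != 0)		/* read error or truncated entry header */
		goto fail;
	return (tmpfd);
fail:
	close(tmpfd);
	return (-1);
}

/* kvm_read()-style access: read len bytes stored for address addr. */
static ssize_t
dump_read(int tmpfd, uint64_t addr, void *buf, size_t len)
{
	return (pread(tmpfd, buf, len, (off_t)addr));
}
```

Writing two entries at addresses 0x1000 and 0x20000 and replaying them gives a
sparse tempfile in which dump_read() at either address returns the stored
bytes. Regions that were never dumped read back as zeros or hit end of file,
so a real implementation would have to detect such reads.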