From owner-freebsd-dtrace@FreeBSD.ORG Thu Oct 17 20:06:59 2013 Return-Path: Delivered-To: freebsd-dtrace@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 86A32CE1 for ; Thu, 17 Oct 2013 20:06:59 +0000 (UTC) (envelope-from jmg@h2.funkthat.com) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6B0482EDC for ; Thu, 17 Oct 2013 20:06:59 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id r9HK6wp7078513 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 17 Oct 2013 13:06:58 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id r9HK6wfn078512 for freebsd-dtrace@FreeBSD.org; Thu, 17 Oct 2013 13:06:58 -0700 (PDT) (envelope-from jmg) Date: Thu, 17 Oct 2013 13:06:58 -0700 From: John-Mark Gurney To: freebsd-dtrace@FreeBSD.org Subject: new issues w/ dtrace aborting... Message-ID: <20131017200658.GG56872@funkthat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Thu, 17 Oct 2013 13:06:59 -0700 (PDT) X-BeenThere: freebsd-dtrace@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "A discussion list for developers working on DTrace in FreeBSD." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Oct 2013 20:06:59 -0000 I'm see this failure which is reproducable: # dtrace -s ./disklatencycmd.d -x evaltime=postinit -c ./catall Tracing... Hit Ctrl-C to end. Now, I have spent some time trying to debug this error and have gotten a stack trace: #0 dt_divide_128 (dividend=0x7fffffffd240, divisor=0, quotient=0x7fffffffd240) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c:221 #1 0x0000000800a8d343 in dt_stddev (data=0x802bb3e48, normal=1) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c:372 #2 0x0000000800a9549a in dt_aggregate_valcmp (lhs=, rhs=) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:120 #3 0x0000000800a93821 in dt_aggregate_varvalcmp (lhs=0x804679240, rhs=0x804679260) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:950 #4 0x0000000801a3d6aa in qsort () from /lib/libc.so.7 #5 0x0000000800a93c88 in dt_aggregate_walk_sorted (dtp=0x80281d000, func=0x800a8f7d0 , arg=0x7fffffffd498, sfunc=) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:1275 #6 0x0000000800a951c6 in dtrace_aggregate_print (dtp=0x80281d000, fp=, func=) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:1880 #7 0x000000000040387f in main (argc=, argv=0x7fffffffd700) at /usr/src/cddl/usr.sbin/dtrace/../../../cddl/contrib/opensolaris/cmd/dtrace/dtrace.c:1921 Now the weird part is that between frame 3 and frame 0, the address of lhs/data changes, yet if you look at the code, it is passing the pointer straight through w/o modification... The value pointed to by lhs is valid and non-zero, but in frame 0 the different pointer is now zero.. The script catall: #!/bin/sh - #since dtrace has issues: find /mnt -type f -exec cat {} + > /dev/null The directory /mnt has a FS that only contains a recent export of HEAD.. I also umount/mount before each run to make sure the disk cache is clear, otherwise it's possible that all the data will be cached in memory, and not perform any io... It looks like it's an issue w/ clear(@stddev) as this is the script I've reduced it to reproduce the failiure: #pragma D option quiet #pragma D option dynvarsize=16m io:::start { start_time[arg0] = timestamp; } io:::done /this->start = start_time[arg0]/ { this->delta = (timestamp - this->start) / 1000; @stddev[args[1]->device_name, args[1]->unit_number] = stddev(this->delta ); start_time[arg0] = 0; } dtrace:::END { clear(@stddev); } Obviously for this to be useful, you'd print out the stddev, but the abort happens either way... Thanks. -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not."