From: Don Wilde
Organization: Soligen, Incorporated
Date: Fri, 12 Dec 1997 12:31:16 -0800
To: Amancio Hasty, don@partsnow.com
CC: Multi Media
Subject: Re: remote controls
Message-ID: <34919F14.1295DF6D@partsnow.com>
References: <199712121824.KAA02267@rah.star-gate.com>
Sender: owner-freebsd-multimedia@freebsd.org

Amancio Hasty wrote:
>
> Yeap, I know about that hole in our current multimedia offerings and
> I am still keeping an eye on how to get voice-recognition into
> FreeBSD.
>

The voice code was actually very simple, and ran on the 8-bit
microcontroller. Basically, you pressed a button to get it to listen,
and it sent out a TV MUTE command. It then sampled 5 frequency bands
until the sound went away or it ran out of DRAM, did a quick FFT, and
compared the result against its known samples. The code was further
constrained in two ways: it would only expect a limited subset of the
words at any given point in the programming sequence, and it had to be
trained for each of four possible voices.
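A rough sketch of that matching step, for illustration only. This is not the original microcontroller code. It assumes five band centers (the post does not say which bands were used), swaps the "quick FFT" for a Goertzel filter per band, and uses pure tones as stand-ins for trained words. The names `band_profile`, `recognize` and `tone` are made up for this example.

```python
import math

# Hypothetical band centers in Hz; the post does not name the five bands.
BAND_CENTERS_HZ = [300.0, 600.0, 1200.0, 2400.0, 3600.0]


def goertzel_power(samples, rate, freq):
    """Signal power at `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def band_profile(samples, rate=8000):
    """Share of energy in each band, normalized so loudness drops out."""
    powers = [goertzel_power(samples, rate, f) for f in BAND_CENTERS_HZ]
    total = sum(powers) or 1.0
    return [p / total for p in powers]


def recognize(samples, templates, allowed, rate=8000):
    """Return the word in `allowed` whose stored profile is nearest."""
    profile = band_profile(samples, rate)

    def distance(word):
        return sum((a - b) ** 2 for a, b in zip(profile, templates[word]))

    return min(allowed, key=distance)


def tone(freq, rate=8000, n=800, amp=1.0):
    """Test signal: a pure sine, standing in for a spoken word."""
    return [amp * math.sin(2.0 * math.pi * freq * i / rate) for i in range(n)]


# "Training": store one profile per word (one voice only in this sketch).
templates = {
    "up": band_profile(tone(300.0)),
    "mute": band_profile(tone(1200.0)),
    "down": band_profile(tone(2400.0)),
}

# A quiet "up" with some energy leaking into the 1200 Hz band.
heard = [a + b for a, b in zip(tone(300.0, amp=0.3), tone(1200.0, amp=0.1))]
print(recognize(heard, templates, ["up", "down", "mute"]))
```

Passing a shorter `allowed` list is how the "limited subset of the words at any given point" constraint shows up here: words outside the list are never considered, even if their template is the closest. A single Goertzel bin is far narrower than a real filter band, and real speech would need wider bands, several time frames per word and a template set per voice.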
Our Pentiums ought to be able to perform software filtering on a disk
file stream in near-real-time, certainly well enough for discrete
speech command recognition, even without such tricks as MMX.

It depends what you want. If you expect the world, i.e. full connected
speaker-independent recognition of umpteen thousands of words, ainna
gonna happen. Kurzweil and Dragon have been fighting for years with
megadollars of R&D backing to get to the point where such systems are
even minimally usable.

Command recognition with an attention-button trigger is fairly simple.
Even a command-word trigger, where the input is always scanning to hear
its trigger word, is not exorbitantly processor-intensive. Could be
done :)

I'll tell you, though, Amancio, we're sure building _expensive_ TVs
here...!

--
Don Wilde [don@PartsNow.com] [http://www.PartsNow.com]