
Re: Emacs and Festival

From: David_Picón_Álvarez <david@miradoiro.com>
Subject: Re: Emacs and Festival
Date: Sun, 31 Dec 2006 11:46:41 +0100

> First sorry about my replies, my mail agent atm isn't adding the quoting
> sign for some obscure reason.
> From: "Tim Cross" <tcross@rapttech.com.au>
> You are correct that festival offers more sophisticated speech
> services, but I believe one of the biggest criticisms of festival was
> that it was not as responsive/fast as other systems, in particular
> flite, which is a light weight version.
> That criticism was, and to some extent is, correct. Yet computers become
> ever faster and get more RAM, and festival's getting usable about now. At
> least for me. So it would be really nice to be able to use it. However, from
> the rest of your reply, this won't be possible, or at least not easy.

I agree that as CPUs and memory increase in speed/size, festival becomes more
acceptable wrt performance. I've thought about a proper interface between
emacspeak and festival, but it's low on the project list at present.

> With respect to TTS engines for emacspeak, I think by far the best is
> IBM's ViaVoice. Unfortunately, it is difficult to get runtime licenses
> for the software. There have been promises of individual runtime
> licenses being made available, but as far as I know, this has not yet
> occurred. Therefore, the only way to easily get a runtime is to buy the
> SDK, which is around $300US. I was lucky enough to convince my
> employer to purchase it for me and it's worked really well.
> Could you give me some direction on who is selling the SDK? Just a URL would
> do, or if you have more info about pricing etc it would also be welcome but
> I can as well check it on the web.

Wizzard Software is where I got it from. I think the URL was

> By the way, there's a new(?) free (as in speech) TTS engine called eSpeak
> which in my view is quite good in several ways that matter, among which are
> responsiveness, speed and clarity at high speed. Maybe one day it will also
> be possible to use it with emacspeak.

Yes, I downloaded and installed this package a few days back and was quite
impressed with the quality of the speech - certainly as good as flite IMO.
Depending on the API, it may not be too difficult to create a speech server for
emacspeak which uses espeak. Essentially, the following would need to be done:

1. Create an espeak-voices.el file (see one of the others, like
outloud-voices.el). This file just defines a few basic commands that emacspeak
will use to generate text and any "controls" necessary to use espeak. Emacspeak
uses the functions defined in this file to build up the strings of text and
commands that are sent to the tcl interface for the speech server.

2. Create an espeaktcl.so library that can be loaded by an espeak.tcl script.
This library essentially creates new high level Tcl commands for doing basic
text-to-speech operations. The tcl commands are defined in this library file
because they are built with low level API calls to both the TTS engine and the
Linux sound layer (either ALSA or OSS, though ALSA is preferred these days). As
this library is a Tcl extension library, it creates commands that are seen by
Tcl in the same way as standard Tcl commands. There is a set of these commands
which must exist. These can be determined by looking at the code for one of the
existing libraries, such as atcleci.cpp (for ViaVoice) or dtktcl.c (for
software dectalk). 

3. Create a Tcl espeak script. This script reads in text passed to it from
emacs, performs some basic processing (such as modifying text to handle special
characters, some text cleanup and handling of punctuation, dealing with
capitals, repeated patterns of text etc) and finally sends the text to be
converted into speech and passed to the sound layer. It uses the commands defined
in the library outlined in [2] above and additional subroutines defined in the
tts-lib.tcl file in the servers directory. 
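
To give a feel for the kind of text processing described in [3], here is a
minimal sketch. It is written in Python purely for illustration (the real
script would be Tcl), and every name in it (SPOKEN, collapse_repeats,
speak_punctuation, mark_capitals, prepare) is hypothetical rather than taken
from emacspeak or tts-lib.tcl. The spoken forms and the three punctuation
modes are assumptions too.

```python
import re

# Hypothetical spoken names for a few special characters (an assumption,
# not the table any real emacspeak server uses).
SPOKEN = {"*": "star", "#": "pound", "-": "dash", "=": "equals", "_": "underscore"}


def collapse_repeats(text, min_run=4):
    """Replace runs such as '-----' with a short phrase like ' 5 dash '."""
    def describe(match):
        char = match.group(1)
        count = len(match.group(0))
        return f" {count} {SPOKEN.get(char, char)} "

    # One punctuation character followed by at least (min_run - 1) copies.
    pattern = r"([^\w\s])\1{%d,}" % (min_run - 1)
    return re.sub(pattern, describe, text)


def speak_punctuation(text, mode="some"):
    """'all' speaks the listed characters, 'none' drops them, else no change."""
    if mode == "all":
        return "".join(f" {SPOKEN[c]} " if c in SPOKEN else c for c in text)
    if mode == "none":
        return "".join(" " if c in SPOKEN else c for c in text)
    return text


def mark_capitals(text):
    """Prefix each word that starts with a capital with the word 'cap'."""
    return re.sub(r"\b([A-Z])", r"cap \1", text)


def prepare(text, mode="some", caps=False):
    """Run the clean-up steps in order and normalise whitespace."""
    text = collapse_repeats(text)
    text = speak_punctuation(text, mode)
    if caps:
        text = mark_capitals(text)
    return " ".join(text.split())
```

For example, prepare("-----  Done") gives "5 dash Done", and
prepare("a*b", mode="all") gives "a star b". The string returned by prepare
is what would then be handed to the commands from the library in [2].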

All of this is not as complex as it sounds as there is a lot of overlap between
different servers. Essentially, you can grab the atcleci.cpp file and use it as
a skeleton. You shouldn't need to modify any of the alsa specific code and
would merely need to replace the ViaVoice library calls with ones which would
essentially generate the necessary wav data from specific text and pass that to
alsa. You then would start off with a basic espeak tcl script that did nothing
fancy and you would have an interface and tts server which was as good as the
one for flite (possibly better as it would be of similar status to the dtk-soft
and ViaVoice servers and have a higher likelihood of being incorporated into
emacspeak proper).
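
The "skeleton with the engine calls swapped out" idea can be sketched roughly
as follows. Again this is Python for illustration only, not the actual
Tcl/C++ code, and SpeechServer, FakeEngine and play are hypothetical names.
The single-letter commands (q to queue text, d to dispatch the queue, s to
stop) are modelled loosely on what I remember of the existing servers, so
treat them as an assumption and check the real sources before relying on
them.

```python
class FakeEngine:
    """Stand-in for the TTS engine; a real one would return audio samples."""

    def synthesize(self, text):
        # Placeholder: pretend the encoded text is the audio data.
        return text.encode("utf-8")


class SpeechServer:
    """Engine-independent part: parse commands, queue text, dispatch audio."""

    def __init__(self, engine, play):
        self.engine = engine  # the only piece that changes per TTS engine
        self.play = play      # callable that hands audio to the sound layer
        self.queue = []

    def handle(self, line):
        command, _, rest = line.strip().partition(" ")
        if command == "q":
            # Queue the text, dropping the surrounding braces.
            self.queue.append(rest.strip().strip("{}"))
        elif command == "d":
            # Dispatch: synthesize everything queued and send it to be played.
            for text in self.queue:
                self.play(self.engine.synthesize(text))
            self.queue.clear()
        elif command == "s":
            # Stop: discard anything not yet spoken.
            self.queue.clear()
        # Any other command is ignored in this sketch.
```

A quick way to try it is to pass a list's append method as play, feed it
"q {hello world}" followed by "d", and look at what ends up in the list.
Supporting a different engine then means replacing FakeEngine only, which is
the same reason the alsa code in atcleci.cpp should not need to change.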

