<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>

<head>
  <title>espeakedit: Mbrola Voices</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<A href="docindex.html">Back</A>
<hr>
<h2>MBROLA VOICES</h2>
<hr>
The Mbrola project is a collection of diphone voices for speech synthesis.  The voices do not include any text-to-phoneme translation, so this must be done by another program.  The Mbrola voices are free of charge but are not open source.  They are available from the Mbrola website at:<br>
  <a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html">http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html</a>

<p>
eSpeak can be used as a front-end to Mbrola.  It provides the spelling-to-phoneme translation and intonation, which Mbrola then uses to generate the speech sound.

<h3>Voice Names</h3>

To use an Mbrola voice, eSpeak needs information to translate from its own phonemes to the equivalent Mbrola phonemes.  This has been set up for only some voices so far.
<p>
The eSpeak voices which use Mbrola are named:<br>
 &nbsp; <b>mb-</b>xxx
<p>
where xxx is the name of an Mbrola voice (e.g. <b>mb-en1</b> for the Mbrola "<b>en1</b>" English voice).  These voice files are in eSpeak's directory <code>espeak-data/voices/mbrola</code>.
<p>
The installation instructions below use the Mbrola voice "en1" as an example.  You can use any other Mbrola voice for which there is an equivalent eSpeak voice in <code>espeak-data/voices/mbrola</code>.
<p>
There are some additional eSpeak Mbrola voices which speak English text using an Mbrola voice for a different language.  Their names contain the name of the Mbrola voice followed by the suffix <b>-en</b>.  For example, the voice <b>mb-de4-en</b> will speak English text with a German accent by using the Mbrola <b>de4</b> voice.
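<p>
The naming convention above can be illustrated with a small Python sketch. The function name <code>parse_mbrola_voice_name</code> is hypothetical and is not part of eSpeak; it only splits a voice name according to the <b>mb-</b> prefix and optional <b>-en</b> suffix described in this section.

```python
def parse_mbrola_voice_name(name):
    """Split an eSpeak Mbrola voice name into (mbrola_voice, speaks_english_text).

    Returns None if the name does not start with the "mb-" prefix.
    Hypothetical helper for illustration; not an eSpeak function.
    """
    prefix = "mb-"
    if not name.startswith(prefix):
        return None
    rest = name[len(prefix):]
    # A trailing "-en" marks a voice which speaks English text using an
    # Mbrola voice made for a different language (e.g. mb-de4-en).
    if rest.endswith("-en"):
        return rest[:-3], True
    return rest, False


print(parse_mbrola_voice_name("mb-en1"))     # ('en1', False)
print(parse_mbrola_voice_name("mb-de4-en"))  # ('de4', True)
```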
<h3>Windows Installation</h3>

The SAPI5 version of eSpeak uses mbrola.dll.
<ol>
<li>Install eSpeak. Include the voice <b>mb-en1</b> in the
list of voices during the eSpeak installation.
<p>
<li>Install the PC/Windows version of Mbrola (MbrolaTools35.exe) from:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pcwin/MbrolaTools35.exe"> http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pcwin/MbrolaTools35.exe</a>.
<p>
<li>Get the <b>en1</b> voice from:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html"> http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html</a>.
Unpack the archive, and copy the "<b>en1</b>" data file (not the whole "en1"
directory) into
<code>C:/Program Files/eSpeak/espeak-data/mbrola</code>.
<p>
<li>Use the voice <b>espeak-MB-EN1</b> from the list of SAPI5 voices.
</ol>
<h3>Linux Installation</h3>

There does not appear to be a Linux shared library version of Mbrola (equivalent to mbrola.dll), so eSpeak has to pipe phoneme data to the command-line Mbrola program.
<ol>
<li>To install the Linux Mbrola binary, download:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pclinux/mbr301h.zip"> http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pclinux/mbr301h.zip</a>.
Unpack the archive, then copy the file <code>mbrola-linux-i386</code> to somewhere in your executable path,
renaming it to <code>mbrola</code> (e.g. <code>/usr/bin/mbrola</code>).
<p>
<li>Get the <b>en1</b> voice from:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html"> http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html</a>.
Unpack the archive, and copy the "<b>en1</b>" data file (not the whole "en1"
directory) somewhere convenient (e.g. <code>/usr/share/mbrola/en1</code>).
<p>
<li>If you use the eSpeak voice "<b>mb-en1</b>" then eSpeak will generate
Mbrola phoneme data on its stdout.  You can pipe this into Mbrola:
<p>
<code>espeak -v mb-en1 -f textfile | mbrola -e /usr/share/mbrola/en1 - test.wav</code>
<p>
This will put the Mbrola speech output into a WAV file.  Alternatively, you can pipe the output from Mbrola through aplay:
<p>
<code>espeak -v mb-en1 -f textfile | mbrola -e /usr/share/mbrola/en1 - - | aplay -r16000 -fS16</code>
<p>
The <code>-e</code> option prevents Mbrola from stopping if it finds a combination
of phonemes which it doesn't recognise.
<p>
Some Mbrola voices (de5, de6) use a sample rate of 22050 Hz. These need <code>-r22050</code> rather than <code>-r16000</code>.
</ol>
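<p>
The same pipeline can be driven from a script. The following Python sketch is only an illustration: the function names <code>build_pipeline</code> and <code>speak_file</code> are hypothetical, the voice data file is assumed to be at <code>/usr/share/mbrola/&lt;voice&gt;</code> as in the example above, and any voice other than de5 and de6 is assumed to use 16000 Hz, which may not hold for every Mbrola voice.

```python
import subprocess

# Voices which the text names as using 22050 Hz.
# Assumption: every other voice uses 16000 Hz.
VOICES_AT_22050_HZ = {"de5", "de6"}


def build_pipeline(mbrola_voice, text_file, voice_dir="/usr/share/mbrola"):
    """Return the three argument lists for: espeak | mbrola | aplay."""
    rate = 22050 if mbrola_voice in VOICES_AT_22050_HZ else 16000
    espeak_cmd = ["espeak", "-v", "mb-" + mbrola_voice, "-f", text_file]
    # -e: don't stop on unrecognised phoneme combinations.
    # The two "-" arguments mean: read from stdin, write to stdout.
    mbrola_cmd = ["mbrola", "-e", voice_dir + "/" + mbrola_voice, "-", "-"]
    aplay_cmd = ["aplay", "-r%d" % rate, "-fS16"]
    return espeak_cmd, mbrola_cmd, aplay_cmd


def speak_file(mbrola_voice, text_file):
    """Run the pipeline. Needs espeak, mbrola and aplay in the executable path."""
    espeak_cmd, mbrola_cmd, aplay_cmd = build_pipeline(mbrola_voice, text_file)
    espeak = subprocess.Popen(espeak_cmd, stdout=subprocess.PIPE)
    mbrola = subprocess.Popen(mbrola_cmd, stdin=espeak.stdout,
                              stdout=subprocess.PIPE)
    espeak.stdout.close()  # let espeak see a broken pipe if mbrola exits early
    aplay = subprocess.Popen(aplay_cmd, stdin=mbrola.stdout)
    mbrola.stdout.close()
    return aplay.wait()
```

<p>
For example, <code>speak_file("en1", "textfile")</code> would run the equivalent of the aplay command line shown above.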
<h3>Mbrola Voice Files</h3>

eSpeak's voice files for Mbrola voices are in the directory <code>espeak-data/voices/mbrola</code>.  Each contains a line:<br>
 &nbsp; <code>mbrola  &lt;voice&gt;  &lt;translation&gt;</code>
<br>
e.g.<br>
 &nbsp; <code>mbrola  en1  en1_phtrans</code>
<ul>
<li><b>&lt;voice&gt;</b> is the name of the Mbrola voice.
<p>
<li><b>&lt;translation&gt;</b> is a translation file to convert between eSpeak phonemes and the equivalent Mbrola phonemes.  These are kept in:
  <code>espeak-data/mbrola_ph</code>
</ul>
The translation files are binary files which are compiled, using espeakedit, from source files in <code>phsource/mbrola</code>, as described in the next section.
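<p>
As a rough illustration of the <code>mbrola</code> line format, the Python sketch below picks the voice and translation names out of the text of a voice file. The function name <code>find_mbrola_line</code> is hypothetical, and the other lines in the sample text are made up for the example rather than copied from a real voice file.

```python
def find_mbrola_line(voice_file_text):
    """Return (voice, translation) from the first "mbrola" line, or None.

    Hypothetical helper for illustration; not how eSpeak itself reads
    its voice files.
    """
    for line in voice_file_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "mbrola":
            return fields[1], fields[2]
    return None


# Made-up voice file text; only the "mbrola" line follows the format above.
sample = "name en1-mbrola\nlanguage en\nmbrola  en1  en1_phtrans\n"
print(find_mbrola_line(sample))  # ('en1', 'en1_phtrans')
```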
<h3>Mbrola Phoneme Translation Data</h3>
Mbrola phoneme translation files specify translations from eSpeak phoneme names to Mbrola phoneme names.  They are referenced from voice files.
<p>
The source files are in <code>phsource/mbrola</code>.  These are compiled using the <code>espeakedit</code> program (<code>Compile-&gt;Compile mbrola phonemes list</code>) to produce data files in <code>espeak-data/mbrola_ph</code> which are used by eSpeak.
<p>
Each line in the Mbrola phoneme translation file contains:
<p>
<code>
&lt;control&gt; &lt;espeak ph1&gt; &lt;espeak ph2&gt; &lt;percent&gt; &lt;mbrola ph1&gt; [&lt;mbrola ph2&gt;]
</code>
<ul>
<li><b>&lt;control&gt;</b><ul>
<li>bit 0 &nbsp;  skip the next phoneme
<li>bit 1 &nbsp;  match this and the previous phoneme
<li>bit 2 &nbsp;  only at the start of a word
<li>bit 3 &nbsp;  don't match two phonemes across a word boundary
</ul><p>
<li><b>&lt;espeak ph1&gt;</b><br>
The eSpeak phoneme which is to be translated to an Mbrola phoneme.
<p>
<li><b>&lt;espeak ph2&gt;</b><br>
If this field is not <code>NULL</code>, then the match only occurs if this field matches the next phoneme.  If control bit 1 is set, then the <i>previous</i> rather than the <i>next</i> phoneme is matched.  This field may also have the following value:<br>
<code>VWL</code> &nbsp; matches any vowel phoneme.
<p>
<li><b>&lt;percent&gt;</b><br>
If this field is zero then only one Mbrola phoneme is used.  If this field is non-zero, then two Mbrola phonemes are used, and this value gives the percentage length of the first Mbrola phoneme.
<p>
<li><b>&lt;mbrola ph1&gt;</b><br>
The Mbrola phoneme to which the eSpeak phoneme is translated.  This field may be <code>NULL</code>.
<p>
<li><b>&lt;mbrola ph2&gt;</b><br>
The second Mbrola phoneme.  This field is only used if the &lt;percent&gt; field is not zero.
<p>
</ul>
The list is searched from start to finish, until a match is found.  Therefore, a line with a more specific match condition should appear before a line which matches the same eSpeak phoneme but with a more general condition.
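<p>
The first-match search can be sketched in Python as follows. This is one reading of the field descriptions above, not eSpeak's actual implementation: the names <code>Rule</code>, <code>Phoneme</code>, <code>parse_rule</code> and <code>find_rule</code> are hypothetical, the control field is assumed to be written as a decimal number, and the three rule lines are invented for the example. Handling of control bit 0 (skipping the next phoneme) is left to the caller.

```python
from collections import namedtuple

Rule = namedtuple("Rule", "control ph1 ph2 percent mbr1 mbr2")
Phoneme = namedtuple("Phoneme", "name is_vowel word_start")

# Control bits, as listed above.
SKIP_NEXT, MATCH_PREVIOUS, WORD_START_ONLY, NOT_ACROSS_WORDS = 1, 2, 4, 8


def parse_rule(line):
    """Parse one translation line (control assumed to be decimal)."""
    f = line.split()
    mbr2 = f[5] if len(f) > 5 else "NULL"
    return Rule(int(f[0]), f[1], f[2], int(f[3]), f[4], mbr2)


def find_rule(rules, phonemes, i):
    """Return the first rule which matches phonemes[i], or None."""
    cur = phonemes[i]
    for rule in rules:
        if rule.ph1 != cur.name:
            continue
        if rule.control & WORD_START_ONLY and not cur.word_start:
            continue
        if rule.ph2 != "NULL":
            use_prev = bool(rule.control & MATCH_PREVIOUS)
            j = i - 1 if use_prev else i + 1
            if j < 0 or j >= len(phonemes):
                continue
            other = phonemes[j]
            # The pair spans a word boundary if its later phoneme starts a word.
            later = cur if use_prev else other
            if rule.control & NOT_ACROSS_WORDS and later.word_start:
                continue
            if rule.ph2 == "VWL":
                if not other.is_vowel:
                    continue
            elif rule.ph2 != other.name:
                continue
        return rule
    return None


# Invented rules: the specific "r before a vowel" line comes first.
rules = [parse_rule(s) for s in (
    "0 r VWL 0 r",
    "0 r NULL 0 r2",
    "0 aI NULL 50 a I",
)]
phonemes = [Phoneme("r", False, True), Phoneme("aI", True, False),
            Phoneme("r", False, False)]
print([find_rule(rules, phonemes, i).mbr1 for i in range(3)])  # ['r', 'a', 'r2']
```

<p>
If the two "r" lines were swapped, the general line would match first and the "r before a vowel" line would never be used, which is why the more specific line has to come earlier.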
<p>
The file <code>dictsource/dict_phonemes</code> lists the eSpeak phonemes which are used for each language.  Translations for all of these should be given in the Mbrola phoneme translation file.  In addition, some phonemes which are referenced from phoneme files (e.g. <code>phsource/ph_language, phsource/phonemes</code>) in lines such as:<pre>
   beforenotvowel   l/
   reduceto  a#  0
</pre>
should also be included, even though they don't appear in <code>dictsource/dict_phonemes</code>.
<p>
If the language's *_list or *_rules files include rules to speak words "as English", the Mbrola phoneme translation file should include rules which translate English phonemes into near equivalents, so that they can be spoken by the Mbrola voice.
</body>
</html>