How you learn to hear
Posted by: mikeeschman on 24 January 2010
About seven years ago, I got involved in a product that would read text. One of the first things I did was to search out everyone who was trying to do that, and give their results a listen.
It was all too mechanical, and in a way that made you ignore what they were talking about.
How I came to what follows is another story. At any rate, this is the way I remember it :-)
The task was to make five million pages of text available to blind people. That means one issue was "how quickly can you do it?" I found a solution that would be complete in a year.
One question I had to answer was "what has to be preserved in the presentation to convey the actual intention?" In other words, how to convey undistorted meaning.
I surveyed the Project Gutenberg materials for two years, running tests and broadcasting the spoken results 24/7 for over a year. I received over 1,500 responses that cited words pronounced incorrectly. From these, we built a dictionary that corrected the way those words were spoken and took tense into account.
In the final analysis, preservation of rhythm and inflection produced audio texts that more people would give a listen to.
It was far from perfect, but I think we advanced the art of text-to-speech :-)
That is my most direct contact with the idea of reproduction. It is not much, but it is a first-person account :-)
Everything I had previously learned about playing and hearing music played a role in how I handled this text-to-speech problem. It was a rare chance to see some aspect of yourself the way it really is. Always interesting, and somehow transforming.
So what about Debussy?