Megamarc, what motivates you to make an engine?
(03-09-2019, 10:40 PM)Domarius Wrote: My only concern is how bland it would sound.  As good as text-to-speech has gotten these days, it still sounds fake and uninteresting.

That's the best part. You wouldn't design it as a text-to-speech solution. You'd record an actual performance, then extract the relevant data from it: phoneme timing, volume, pitch, and so on. Then you incorporate all of that into the playback, using individual phonemes as "instruments." The vast majority of a vocal performance would be preserved, just broken down into keyframe data that can be adjusted dynamically. The initial pass for something like this probably wouldn't sound perfect, but it would be a big step up from a text-to-speech solution. Samples could be taken from different actors in order to create different virtual characters for playback.
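To make the keyframe idea a bit more concrete, here's a rough Python sketch of what the extracted data might look like. Nothing here comes from an existing engine: `PhonemeKey`, `adjust`, and `active_key` are names I made up for illustration, and the fields are just the ones mentioned above (timing, volume, pitch). It only covers the data side, not the actual audio playback.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class PhonemeKey:
    """One keyframe extracted from a recorded performance (hypothetical layout)."""
    phoneme: str     # which phoneme "instrument" to trigger, e.g. "AY"
    start: float     # seconds from the start of the line
    duration: float  # seconds
    volume: float    # 0.0 to 1.0
    pitch: float     # Hz


def adjust(keys: List[PhonemeKey], speed: float = 1.0,
           pitch_shift: float = 1.0, gain: float = 1.0) -> List[PhonemeKey]:
    """Return a copy of the performance with timing, pitch and volume rescaled."""
    return [
        replace(
            k,
            start=k.start / speed,
            duration=k.duration / speed,
            pitch=k.pitch * pitch_shift,
            volume=min(1.0, k.volume * gain),  # clamp so gain can't exceed full volume
        )
        for k in keys
    ]


def active_key(keys: List[PhonemeKey], t: float) -> Optional[PhonemeKey]:
    """Find the keyframe that should be sounding at time t, if any."""
    for k in keys:
        if k.start <= t < k.start + k.duration:
            return k
    return None


# A two-phoneme "performance", then the same line played faster and higher.
line = [
    PhonemeKey("HH", 0.0, 0.1, 0.5, 200.0),
    PhonemeKey("AY", 0.1, 0.3, 0.8, 220.0),
]
excited = adjust(line, speed=2.0, pitch_shift=1.5)
```

The point is that the original performance stays intact as data, and things like speed or pitch become parameters you can change at playback time instead of re-recording.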
Ohh ok, so you'd have different sound banks for the same actor, maybe one shouting angrily, one sad, one excited, etc.?

