The Dark Mod Forums


Posted

I think it would be quite handy to have an AI tool that can generate dialog in the voices that are already included, and even in those of the various voice actors, if they are OK with that ...
and to have it run locally on your own computer.

Any thoughts on that?

Posted
4 hours ago, datiswous said:

I was looking at this one yesterday:
https://github.com/neonbjb/tortoise-tts
 

 

 

6 hours ago, nbohr1more said:

See:

 

Thanx.

So I see voice actors don't like it, and neither do some other people.

My idea came when I locked out a guard on a balcony and thought: it would be nice if he was yelling something
about being stuck on the balcony in his normal guard voice.
Maybe I/we just have to try what the quality of speech generation is these days.
But .. is there any copyright on these "in game" npc voices?

Posted

I experimented with using AI voices as a placeholder before I asked human actors to do the lines. There are still morality and copyright questions regarding AI voice usage, but that aside, I just find the technology isn't there yet.

 

Here is a comparison from a Builder sermon scene I have in Shadows of Northdale Act 3.

 

This is an AI voice I used for the scene originally as a placeholder

 

 

And this was the final human voice acted version

 

 

In my opinion the human version is infinitely better than the AI generated one. The human voice sounds much more natural, plus there are nuances and inflections the voice actor can bring to the script which can change it in ways that an AI voice can't. Additionally, some voice actors I've worked with will go off script or offer alternative versions, some of which end up being used (a couple of improv lines that AndrosTheOxen used for the shopkeeper in Noble Affairs were featured in the final game).

Posted (edited)

Well, if you have voice actors available then the choice is easy. But I think it could be used to generate alternative voices, or when you don't have the time to contact voice actors, or maybe just for some testing purposes. Also, if you want to have voices in different languages, it could be helpful.

 

34 minutes ago, Goldwell said:

This is an AI voice I used for the scene originally as a placeholder

 

This is pretty good I have to say. What software/site did you use?

Edited by datiswous
Posted
10 minutes ago, datiswous said:

This is pretty good I have to say. What software/site did you use?

https://elevenlabs.io/

Posted (edited)
15 hours ago, STRUNK said:

I was looking at this one yesterday:
https://github.com/neonbjb/tortoise-tts

Yeah, I saw it, but the demo page is not able to generate a voice from an audio file. The WhisperSpeech demo does have this ability. I tested this and it does generate a pretty realistic voice, but it's not really a copy of the voice you supply. Also, if the supplied voice has a heavy accent, it makes mistakes.

Edited by datiswous
Posted (edited)

To be fair, the A.I. voices are better than I expected, but, I agree with Goldwell that humans still have the edge. Expectedly. ;) 

If that's not an option, then A.I. voices surely will do. I like that the timbre of the voice is really close to the Hammerites Stephen Russell voiced. I guess you took that as a sample?

Edited by chakkman
Posted
6 hours ago, datiswous said:

Yeah, I saw it, but the demo page is not able to generate a voice from an audio file. The WhisperSpeech demo does have this ability. I tested this and it does generate a pretty realistic voice, but it's not really a copy of the voice you supply. Also, if the supplied voice has a heavy accent, it makes mistakes.

I tried that yesterday and I thought it sounded pretty bad : P
The thing is, there must also be a way to steer the speech output, and to train voice sets for the different NPCs (bored, angry, alarmed, etc.) for it to be really useful .. I guess. Like training LoRAs for image generation ...
I'll try to install and use tortoise TTS this weekend.

Posted

I got this tortoise TTS up and running after some hassle and made a model for Builder1.
Builder1 has just 4 audio files and I ran 500 epochs on it, which might be way too much for such a small sample size, and the model is about 1.6 GB, but it seems to work quite nicely.
Some outputs sounded a bit strange, and for most of them I had to cut off the start of the audio file because there was some garbage.
I still have to play around with settings but for now it's looking (sounding) quite nice.

 

To install it I followed this tutorial, though some things differ a bit when you install version 3 that came after this tutorial:

He is using some other programs to remove background noise and to prepare the audio, but the voices in TDM are already clean so no need for that. What you will have to do is convert all the ogg files to wav (batch convert with vlc player) for tortoise TTS to be able to handle them.
You also need an Nvidia graphics card.
That said, on my rtx5000 laptop GPU it all takes a lot of time ...
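
For anyone who would rather script the ogg-to-wav step than click through VLC, here is a minimal sketch. It swaps VLC for the ffmpeg command-line tool, which is an assumption on my part (it has to be installed and on your PATH), and the helper names and folder names are made up for illustration. It does not change the sample rate or do any cleanup, so check what your tortoise TTS setup expects before training.

```python
from pathlib import Path
import shutil
import subprocess


def wav_target(ogg_path: Path, out_dir: Path) -> Path:
    """Return the output path: same base name, .wav extension, inside out_dir."""
    return out_dir / (ogg_path.stem + ".wav")


def build_command(ogg_path: Path, wav_path: Path) -> list[str]:
    """Build the ffmpeg call; -y overwrites an existing output file."""
    return ["ffmpeg", "-y", "-i", str(ogg_path), str(wav_path)]


def convert_all(src_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every .ogg directly inside src_dir and return the new .wav paths."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for ogg_path in sorted(src_dir.glob("*.ogg")):
        wav_path = wav_target(ogg_path, out_dir)
        # check=True raises if ffmpeg fails on a file, so errors are not silent
        subprocess.run(build_command(ogg_path, wav_path), check=True)
        written.append(wav_path)
    return written


# Example call (both folder names are placeholders):
# convert_all(Path("builder1_ogg"), Path("builder1_wav"))
```

The original ogg files are left untouched, so you can rerun it with a different output folder if the first attempt is not what the trainer wants.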

 

 

 

Posted

The cloning of the characteristics of the voice works quite nicely.
I selected sets of audio clips by how "loud" the speech is, Loud, Normal and Soft (speaking up, speaking normal, speaking soft).
I didn't figure out how to get the best audio quality, without weird "audio artifacts".
But as demonstrated down here, the moor certainly sounds like the moor:

 
Posted
1 hour ago, datiswous said:

That's pretty amazing. So this is not related to tortoise-tts?

F5-TTS is a different, non-autoregressive model.
The moor voice was done with tortoise-TTS.

Posted (edited)

It's pretty remarkable what's possible these days. Maybe large voice samples are no longer even needed.

If they are, though, and anyone is looking for samples to train new characters, I suggest considering LibriVox recordings. Since these readings are all in the public domain, the legal and ethical case for using them to create derivative works with AI is much less fraught. LibriVox even says so themselves: https://wiki.librivox.org/index.php?title=LibriVox_and_Artificial_Intelligence_(AI)

In particular, Frankenstein, or the Modern Prometheus (version 3) narrated by Caden Vaughn Clegg is excellent. I could imagine a voice and style of intonation like Clegg's main reading pattern fitting really well in The Dark Mod. His monster voice is not bad either. Another interesting one is Greg Bryant in Paradise Lost. His performances could make for a good player character voice. He has a similar gruffness to Stephen Russell as Garrett, but with a different overtone. Lastly, Cori Samuel gives some really good performances that could be suitable for young women characters, especially those of noble or plucky-roguish backgrounds.

Edited by ChronA
