The Dark Mod Forums

English Subtitles for AI Barks


Geep


FYI, as this project goes forward, I plan to work first on barks for which a text vocal script is available.

Complete bark texts are available - via the wiki's "Voices" - for these:

  • The Thug [done]
  • The Grumbler [done]
  • The Pro [done]
  • The Mature Builder (Builder3) [done]
  • The Young Builder (Builder4) [done]
  • The Lord [done]
  • Simpleton [done]
  • Average Jack [done]
  • The Lady [done]
  • The Wench [done, final revision]
  • The Commander [done]
  • The Moor [done]
  • The Maiden [done]

These vocal sets are also referenced under "Voices", but the bark text is missing:

  • Builder 1 & 2* [done]
  • The Drunk [done]
  • The Cynic [done]
  • The Lady02 [done]
  • The Critic [done]

(* Builder 1 just uses Builder 2 vocal assets. Also, beyond the scope here: there are a few "conversation" clips used in the Saint Lucia FM, not intended for general use, even though they are distributed with the core. These differ between Builder 1 and 2, and are "verbosity story". Their subtitles are distributed in the core as file tdm_sound_vocals_decls01.pk4\tdm_stlucia.subs )

If you have bark text for any of the missing vocal sets, please get me a copy. Thanks.

There are additional vocal sets not referenced under "Voices". These are low priority for me. Probably all except the newer manbeast (the only one with a substantial amount of English) will be put off beyond the 2.12 release. They are:

  • manbeast [done]
  • player [mainly various grunts]
  • raven
  • horse
  • generic
  • elemental
  • spider
  • revenant
  • zombie
  • werebeast

Further out still, there are numerous non-AI sound clips for which "effects" subtitles might be provided. Many of these are found in tdm_sound_sfx01.pk4 and tdm_sound_sfx02.pk4

Edited by Geep
Update those marked as [done].
  • Like 1
Link to comment
Share on other sites

Question: If an FM author overrides an AI bark with their own sound file by using the same filename (assuming that is how it works), does the core subtitle still apply?

Edit: maybe it is set in the sndshd file(s). Still same question.

Edited by datiswous
  • Like 1
Link to comment
Share on other sites

You'd certainly want that override to happen. I don't know if it does. You could try a test by providing an override .ogg & subtitle.

Specifically, as an override target, in 2.11, Dragofer provided a dozen subtitles for The Cynic voice as part of the core; see tdm_sound_vocals_decls01.pk4\subtitles\tdm_ai_cynic.subs.

If you want to make a version of my testSubtitles... FM specifically for these dozen, here is content for an appropriate fm_test_subtitles_shaders.sndshd:

Spoiler

// This content is generated; change only if necessary.
// It has custom sound shaders to aid development of subtitles for The Dark Mod (TDM).
// See project: //forums.thedarkmod.com/index.php?/topic/21740-english-subtitles-for-ai-barks/
// It's meant to be used ONLY with an instance of FM 'testSubtitles', typically renamed for a
// particular AI vocal set, and referred to here as <fm>.
// File deployment location: <fm>/sound/ or below.
// Typical file name: fm_test_subtitles_shaders.sndshd
// Only the file extension is prescribed.
// The sound shader prefix below (fm_test_subtitles_shader) should match that #defined in the <fm>'s .script file.

// IMPORTANT: See bottom of this file for the note about MAX_SHADER_NUMBER.

// All TDM sound shaders here use defaults for everything: minDistance, maxDistance, volume, etc.
// The starting (zero) shader is for a special sound indicating start of list, used with all sets of audio samples.

fm_test_subtitles_shader0    { sound/subtitle_list_start_hit_high01.ogg }

// The remaining shaders wrap sound files to be played along with their subtitles under test.
// These .ogg & .wav files are in an expected location,
// either within this fm, or within a .pk4 of the standard tdm distribution.

fm_test_subtitles_shader1    { sound/voices/cynic/tdm_ai_berny_combat_hit_player_company01.ogg }
fm_test_subtitles_shader2    { sound/voices/cynic/tdm_ai_berny_combat_hit_player_company02.ogg }
fm_test_subtitles_shader3    { sound/voices/cynic/tdm_ai_berny_combat_hit_player01.ogg }
fm_test_subtitles_shader4    { sound/voices/cynic/tdm_ai_berny_combat_hit_player02.ogg }
fm_test_subtitles_shader5    { sound/voices/cynic/tdm_ai_berny_combat_throw01.ogg }
fm_test_subtitles_shader6    { sound/voices/cynic/tdm_ai_berny_combat_melee01.ogg }
fm_test_subtitles_shader7    { sound/voices/cynic/tdm_ai_berny_combat_melee02.ogg }
fm_test_subtitles_shader8    { sound/voices/cynic/tdm_ai_berny_combat_melee03.ogg }
fm_test_subtitles_shader9    { sound/voices/cynic/tdm_ai_berny_combat_melee05.ogg }
fm_test_subtitles_shader10    { sound/voices/cynic/tdm_ai_berny_enemy_out_of_reach01.ogg }
fm_test_subtitles_shader11    { sound/voices/cynic/tdm_ai_berny_enemy_out_of_reach02_2.ogg }
fm_test_subtitles_shader12    { sound/voices/cynic/tdm_ai_berny_enemy_out_of_reach03.ogg }

// Change corresponding value in this test app's .script file to:
// #define MAX_SHADER_NUMBER 12
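
The shader list above follows a simple pattern (prefix, running number, sound path in braces), so a list like it can be generated rather than typed. Here is a minimal Python sketch of that idea. It is not the actual buildSubtitleShaders code (that tool is C++); the function name build_shader_lines and its parameters are made up for illustration.

```python
def build_shader_lines(sound_paths, prefix="fm_test_subtitles_shader", start=1):
    """Return one sound-shader line per sound file, numbered from `start`."""
    lines = []
    for number, path in enumerate(sound_paths, start=start):
        # Use forward slashes, as in the shader paths shown in this thread.
        clean = path.replace("\\", "/")
        lines.append(f"{prefix}{number}    {{ {clean} }}")
    return lines


cynic_clips = [
    "sound/voices/cynic/tdm_ai_berny_combat_throw01.ogg",
    "sound/voices/cynic/tdm_ai_berny_combat_melee01.ogg",
]
shader_lines = build_shader_lines(cynic_clips)
print("\n".join(shader_lines))

# The highest number used is the value MAX_SHADER_NUMBER should be set to.
max_shader_number = len(cynic_clips)
```

The special start-of-list shader (number 0) is left out here and would still need to be added by hand.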

BTW, this voice is used by two AI characters:

  1. tdm_ai_guard_citywatch (defined in tdm_ai_humanoid_guards01\def\tdm_ai_guard_citywatch.def)
  2. tdm_ai_thief (in tdm_ai_humanoid_guards01\def\tdm_ai_thief.def)
  • Like 1
Link to comment
Share on other sites

The Wench voice is the first one I've tackled that actually needs some audio clips to be done with srt.

For that, I've downloaded and installed "Cadet", free captioning software produced by public TV station WGBH in Boston. Working with one clip, I was able to produce an srt file. Painful process, though. I clearly need to look at more Cadet tutorials....🤔
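
For anyone who hasn't looked inside one: an srt file is plain text, with numbered cues, a "start --> end" time line in HH:MM:SS,mmm form, and the cue text, separated by blank lines. Below is a minimal Python sketch that writes that layout. The cue text and timings are invented for illustration, and whether TDM needs anything beyond this basic layout is not something this sketch checks.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm (SRT puts a comma before the milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues):
    """cues: list of (start_seconds, end_seconds, text). Returns the file content."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


# Hypothetical cues, for illustration only.
srt_text = build_srt([(0.0, 2.5, "(humming)"), (2.5, 6.0, "Now where did I put that?")])
print(srt_text)
```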

  • Like 1
Link to comment
Share on other sites

On 3/9/2023 at 10:19 PM, datiswous said:

Question: If an FM author overrides an AI bark with their own sound file by using the same filename (assuming that is how it works), does the core subtitle still apply?

Yes, the original subtitle would apply.

What is really bad is that you cannot override subtitle for one sound easily.
Now you can override one decl, but a decl often contains many subtitles.

  • Like 1
  • Sad 1
Link to comment
Share on other sites

On 3/11/2023 at 6:09 PM, Geep said:

The Wench voice is the first one I've tackled that actually needs some audio clips to be done with srt.

For that, I've downloaded and installed "Cadet", free captioning software produced by public TV station WGBH in Boston. Working with one clip, I was able to produce an srt file. Painful process, though. I clearly need to look at more Cadet tutorials....🤔

Have you seen this topic?

I recommend using Kdenlive for positioning the subtitles in the right place. Pretty easy.


Edited by datiswous
  • Like 2
Link to comment
Share on other sites

I actually tested multiple subtitle editors before sticking to this one.

Btw, with the scroll wheel you can move back and forth along the track, and with Ctrl + scroll wheel you can zoom in. Also, to edit a subtitle, you first have to click inside the subtitle edit area before typing; otherwise you get a warning.

Link to comment
Share on other sites

Good to know. It will probably be Thursday/Friday before I get to this.

Quick question. There seem to be two popular styles for adding a leading speaker ID: with a colon, e.g. "Jack: ", and in parentheses, e.g. "(Jack) ". For a TDM conversation, at the start of an srt, which style do you like?

For both styles, general subtitling recommendations seem to say to put the ID on the first line by itself. For TDM, that might have to be amended to "...except where the subtitle phrase doesn't fit entirely on the second line."

BTW, for barks, I'm assuming the ideal max line length is 42 characters/line.

Link to comment
Share on other sites

On 3/3/2023 at 7:52 PM, Geep said:

Testing FM (now with instructions also in Objectives, as requested by @datiswous). Assumes TDM 2.11

testSubtitlesLord.pk4

I noticed you centered the subtitles. Problem is, already created subtitles don't always fully fit in that box, so some words at the end of long sentences are not visible anymore. So for my own testing I removed tdm_subtitles_common.gui .

Link to comment
Share on other sites

@datiswous, the centering of the subtitles was the stock tdm_subtitles_common default, so I left it that way with my The Lord customization. I'm not sure centering (versus, say, left justification) is really the issue with respect to subtitles fitting in the box.

To that point, as a demonstration with testSubtitlesLord, I used rather aggressive side margins of 170 (so a field width of 300), which can just barely accommodate 42 characters/line. But that means, depending on word breaks, some lines with 42 or 41 characters will force an autobreak, as it sounds like you observed.

While a margin of 170 is the minimum required to avoid overlap with fully-enlarged inventory icons, I think such overlap is less critical than avoiding unwanted word-wrap. So going with margins of 160 and a field width of 320 is better going forward, while still avoiding the egregious overlap of the stock tdm_subtitles_common.

Can you live with a character-length restriction of 42 characters/line? I think (given 2 lines) that should be adequate for almost all srt phrases that are limited to about 6 seconds.
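
To put numbers on that: 2 lines of 42 characters is a budget of 84 characters, and 84 characters over 6 seconds works out to 14 characters per second. A minimal Python sketch of the check follows. It uses the standard textwrap module's greedy wrapping as a stand-in, which is an assumption and may not break lines exactly where the game's GUI would.

```python
import textwrap

MAX_CHARS_PER_LINE = 42  # proposed limit from this thread
MAX_LINES = 2            # subtitle field holds two lines


def wrap_subtitle(text, width=MAX_CHARS_PER_LINE):
    """Greedy word-wrap; an approximation of in-game auto-wrap."""
    return textwrap.wrap(text, width=width)


def fits(text, width=MAX_CHARS_PER_LINE, max_lines=MAX_LINES):
    """True if the text wraps into at most `max_lines` lines of `width` chars."""
    return len(wrap_subtitle(text, width)) <= max_lines


def chars_per_second(text, duration_seconds):
    """Reading-speed measure: characters shown per second of display time."""
    return len(text) / duration_seconds


budget = MAX_CHARS_PER_LINE * MAX_LINES      # 84 characters
print(budget, chars_per_second("x" * budget, 6.0))
```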

Another approach to strike a balance between margin overlap and word-wrap would be to reduce the font scale slightly.

And in the long run, it would probably be desirable to have a CVar that the user could use to scale both font size and corresponding field height & width (and probably interfield vertical spacing). This could be shown as part of the HUD Settings special screen, even if internally it is a separate layer. Then the user could find their own compromise.

EDIT: I'll copy this to the "futures" thread, given the "long run" ideas.

Edited by Geep
  • Like 1
Link to comment
Share on other sites

Another utility program, "findTooLongSubtitles", is now available, which scans a directory for .subs and .srt files, and checks the length in characters of each subtitle line against a maximum fieldwidth expressed in characters:
Win executable
C++ source code file

It is more fully described in the latter as:

findTooLongSubtitles.cpp  By Geep, March, 2023, for The Dark Mod, under the terms of its open-source license.

Purpose: Given a particular subtitle maximum fieldwidth, evaluates TDM subtitles - contained in .subs and .srt files - and reports those that don't fit. Assumes a maximum 2-line subtitle field. If a subtitle doesn't currently fit (or suboptimally relies on auto-word-wrap to fit), but could be made good by inserting or adjusting a linebreak, locations where that break could be positioned are shown.

This program only examines a single folder at a time for contained .subs and .srt files. If your FM has these files in multiple places, run this program more than once.

For an "inline" subtitle, a string-embedded "\n" causes a manual linebreak. When shown in this program's output, that subtitle has 2 lines, as in the game. This allows use of a common output routine for inline & srt subtitles.
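
The core of that check can be illustrated in a few lines. This is a simplified Python sketch, not the logic of findTooLongSubtitles.cpp itself: it splits an inline subtitle at its embedded "\n" (backslash followed by n), flags lines over the limit, and lists the space positions where a single break would leave both halves within the limit. The function names and the sample phrase are invented for illustration.

```python
def split_inline(text):
    """Split an inline subtitle at its string-embedded \\n markers."""
    return text.split("\\n")


def too_long_lines(text, max_chars=42):
    """Return the displayed lines that exceed max_chars."""
    return [line for line in split_inline(text) if len(line) > max_chars]


def candidate_breaks(text, max_chars=42):
    """Indexes of spaces where one break leaves both halves within max_chars."""
    flat = text.replace("\\n", " ")  # ignore any existing manual break
    return [
        i
        for i, ch in enumerate(flat)
        if ch == " " and i <= max_chars and len(flat) - i - 1 <= max_chars
    ]


# Hypothetical phrase, too long for one 42-character line.
phrase = "I could have sworn I left my purse right here on the table"
print(too_long_lines(phrase))
print(candidate_breaks(phrase))
```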

Console program invocation:
findTooLongSubtitles -m maxSubtitleCharsPerLine [default is 42] -d dirWithSoundFiles [default is current dir] -o output file [default is stdout]

Build: Requires C++17 or later

For example outputs, evaluating subtitles found in the 2.11 releases of FMs New Job and St. Lucia against a proposed 42-character fieldwidth, see here.

 

Edited by Geep
Improve description
  • Like 1
Link to comment
Share on other sites

FYI, as I work on The Wench, I'm adopting the following style for using parentheses. This differs from my earlier treatment of The Thug and The Lord, where square brackets were more frequent, with a different meaning. (I plan to revisit those subtitles for this and other reasons, after The Wench).

[Fragment from my eventual style guide:]

TO BE DETERMINED: Some styles of including a speaker ID in a subtitle use parentheses or square brackets. This is not important for subtitling barks, but may be for “verbosity story” subtitles.

Parentheses. These are used to refer to what the AI is saying/vocalizing. The most common purpose is:

  • Descriptors for non-word vocalizations that are not otherwise represented, e.g., (coughing) (sneezes)

Other purposes, used sparingly, are:

  • Descriptors for vocal style or sound quality, e.g., (sing-song) (hoarsely)
  • Indicating the speaker’s physical or emotional state, e.g. (surprised) (sleepily) (drowning)
  • Providing stage directions or context, e.g., talking (to buddy)
  • Indicating the speaker is actually talking in a foreign language, though shown in English.

As the examples indicate, the text within parentheses should be all-lower-case and relatively short (for CPS/WPM considerations). If it refers to the entire phrase, put it at the beginning, and, if using a verb, favor the “...ing” form, e.g. (humming). If it refers to a particular location within the sound file, if using a verb, favor the active-form, e.g., (hums). For a long sound, consider indicating its conclusion: (humming done) or (/humming).


Do NOT use parentheses (or square brackets) to enclose speech to indicate:

  • whispering or sotto-voce. Instead, add a prefix word (whispers), (whispering), (confidentially), etc.
  • asides by the speaker. Dashes can be helpful to set off such speech, or an embedded “(aside)”.

Be modest in indicating the loudness of a bark; while the voice tone can be evident, the actual volume at which a bark is emitted is not under the subtitler’s control.


Square Brackets. These are:

  • Mainly reserved for future “verbosity effects” sounds.
  • Can be used for non-vocal sound included with a bark clip, whether associated with the AI or not, but important enough to warrant a subtitle mention. This will be rare. Example: [claps].

Curly Braces. TDM fonts do not support these.
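
Two of the rules above are mechanical enough to check automatically: parenthesized descriptors should be all-lower-case, and curly braces should not appear. A minimal Python sketch of such a check follows; the function name and warning wording are invented for illustration, and the other rules (length, placement, verb form) are not covered.

```python
import re

# Text inside one pair of parentheses, without nesting.
PAREN = re.compile(r"\(([^()]*)\)")


def style_warnings(subtitle):
    """Return a list of style problems found in one subtitle string."""
    warnings = []
    if "{" in subtitle or "}" in subtitle:
        warnings.append("curly braces are not supported by TDM fonts")
    for inner in PAREN.findall(subtitle):
        if inner != inner.lower():
            warnings.append(f"descriptor not lower-case: ({inner})")
    return warnings


print(style_warnings("(coughing) Who's there?"))
print(style_warnings("(Surprised) Oh!"))
```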

 

  • Like 1
Link to comment
Share on other sites

  • 2 weeks later...

The Wench bark subtitles are essentially done, but there are some issues with a handful of .ogg files (see new bug tracker report #6284). So I'm holding up release pending resolution.

I'm going to briefly revisit the subtitles for the first 2 AI characters, to make better use of SRT and of the new duration-extensions capabilities in 2.12dev.

  • Like 1
Link to comment
Share on other sites

On 2/22/2023 at 7:52 PM, Geep said:

@datiswous and others, you can get my Windows console program buildSubtitleShaders.exe now at:
https://drive.google.com/file/d/19Pf513nv5gwOzZ5tyWHJN7Ka-D9UxBET/view?usp=sharing

Just reporting this seems to work fine in Linux via wine in terminal, except you have to specify the export file, otherwise the file is not made.

Link to comment
Share on other sites

Question:

Let's say you have multiple folders with speech sound files. Can you specify multiple sources to generate 1 sndshd for?

Feedback: It would be nice if you can move through the soundfiles without them all getting played automatically.

Edited by datiswous
Link to comment
Share on other sites

1 hour ago, datiswous said:

Just reporting this seems to work fine in Linux via wine in terminal, except you have to specify the export file, otherwise the file is not made.

??? If you don't specify the output file, then it writes to stdout. The expected behavior would be you would see the output on the terminal screen, unless you did something like "> my_new_results.sndshd". Are you not seeing that with wine?

1 hour ago, datiswous said:

Let's say you have multiple folders with speech sound files. Can you specify multiple sources to generate 1 sndshd for?

No, you currently would have to run the program separately for each folder and cut/paste to merge the outputs by hand. Since that's kinda trivial, and programming for multiple input folders less so, probably won't be happening.

1 hour ago, datiswous said:

Feedback: It would be nice if you can move through the soundfiles without them all getting played automatically.

I'm thinking here you're talking about a "testSubtitle..." FM. So in addition to jumping to a particular soundfile by changing the cursub value in the console, you'd maybe like 2 more in-game buttons, to skip forward and skip backwards?

  • Like 1
Link to comment
Share on other sites

6 hours ago, datiswous said:

Well you have to correct all the shadernr's because they all start at 1. But I can live with it

Yeah, that is a pain in the butt.

The main problem with multiple input folders is that you really want to be able to specify relative paths on the command line (i.e., not just absolute), but then reformulate the paths into the form that TDM wants in the sound shader. I tried to code that earlier, and ended up with a real mess. One of those situations where the more you try to fix, the worse it gets. So I abandoned that attempt.

What I could do instead, fairly easily, is to provide another command line option that would let you specify a starting number greater than 1, so you wouldn't have to separately renumber.

EDIT: See next item.

Edited by Geep
  • Like 2
Link to comment
Share on other sites

An update of command-line tool "buildSubtitleShaders" is now available:

Win executable

C++ source code file

With this Feb. 10 version, if you ask for help (buildSubtitleShaders -h or ?), there's an additional option shown:

-c counter start value     (default: 1). Eases merging results from different sound file folders.
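
If you already have outputs that each start at 1, the renumbering can also be done after the fact. Here is a minimal Python sketch of that, assuming the default shader prefix fm_test_subtitles_shader and one shader per line; the function name renumber is made up, and this is not part of buildSubtitleShaders.

```python
import re


def renumber(lines, prefix="fm_test_subtitles_shader", start=1):
    """Renumber shader lines consecutively from `start`; other lines pass through."""
    pattern = re.compile(rf"^{re.escape(prefix)}\d+(\s*\{{.*)$")
    result = []
    number = start
    for line in lines:
        match = pattern.match(line)
        if match:
            result.append(f"{prefix}{number}{match.group(1)}")
            number += 1
        else:
            result.append(line)  # comments and blank lines are kept as-is
    return result


# Hypothetical outputs from two separate runs, each numbered from 1.
folder_a = [
    "fm_test_subtitles_shader1    { sound/a/one.ogg }",
    "fm_test_subtitles_shader2    { sound/a/two.ogg }",
]
folder_b = ["// second folder", "fm_test_subtitles_shader1    { sound/b/three.ogg }"]

merged = folder_a + renumber(folder_b, start=len(folder_a) + 1)
print("\n".join(merged))
```

With the -c option you would get the same numbering directly by passing the start value for the second folder.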

As before, comments in the source code file provide full documentation.

 

  • Like 1
  • Thanks 1
Link to comment
Share on other sites
