The Dark Mod Forums

Subtitles - Possibilities Beyond 2.11


Geep


I'm afraid a default extension won't do any good, because the system does not know about conversations; it only knows about sounds.

Suppose that person A says the sound "Hello!", and then also says the sound "How are you!".
If you extend "Hello!" far enough, it will be displayed along with "How are you!". The system won't magically guess that these two sounds are from the same speaker and thus should not appear at the same time.

UPDATE: There is a chance that the system of channels on every sound emitter already solves this problem.


I have implemented the extension, and it seems to work fine (6262).

Inline subtitle for a sound of duration = T lasts for max(T + 0.2, 1.0) seconds.
This is configurable: the 1.0-second minimum was taken from this thread, and the 0.2-second addition from the hardcoded delay between actions in conversations.
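As a quick illustration of the rule above, here is a minimal Python sketch. The function and parameter names are made up for this example; only the formula max(T + 0.2, 1.0) and the two default values come from the post.

```python
def inline_subtitle_duration(sound_duration, extension=0.2, minimum=1.0):
    """Display time of an inline subtitle: max(T + extension, minimum)."""
    return max(sound_duration + extension, minimum)


# Short clips are held up to the 1-second minimum; longer ones get +0.2 s.
for clip in (0.3, 0.8, 0.9, 2.5):
    shown = inline_subtitle_duration(clip)
    print(f"clip {clip:.1f} s -> subtitle shown for {shown:.1f} s")
```

So any clip shorter than 0.8 seconds is governed by the minimum, and anything longer by the extension.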
 

Also, I noticed that although I intended subtitles not to move between slots, they do in fact move sometimes.
Also, as noticed here, high-level subtitles are not guaranteed to be displayed in the presence of many low-level messages.
So in 6264 I fixed this and passed more information from the sound engine.

For instance, subtitle slots can now be reused by sounds having the same "emitter", which is supposedly very close to the concept of "who says this".
 

Also, the original Doom 3 has the following rules which affect subtitles.

Every "sound emitter" has a bunch of "sound channels". Conversation sounds go into the SND_CHANNEL_VOICE channel, but ordinary barks usually occupy any free channel.
One channel can only play one sound at a time. If a new sound is started on the same emitter and the same channel, then the old sound is stopped and replaced with the new one.

The consequence is that no matter how long subtitles show after the sound is over, you'll never see two conversation subtitle messages from the same actor at once: the newer one would replace the older one. But in the case of barks it is quite possible, since the new message can take a different channel.
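The emitter/channel rules described above can be modelled with a toy Python class. This is only a sketch for reasoning about the rules, not the engine's code: the Emitter class, its methods and the "free_N" channel ids are invented here, and sounds ending on their own is not modelled.

```python
VOICE = "SND_CHANNEL_VOICE"


class Emitter:
    """Toy model of a sound emitter with its own private set of channels."""

    def __init__(self, name):
        self.name = name
        self.channels = {}    # channel id -> sound currently on that channel
        self._free_count = 0

    def start_sound(self, sound, channel=None):
        """Start a sound; channel=None means 'take any free channel'.
        Returns the sound that was replaced, if any."""
        if channel is None:
            channel = f"free_{self._free_count}"  # hypothetical channel ids
            self._free_count += 1
        replaced = self.channels.get(channel)
        self.channels[channel] = sound
        return replaced


guard = Emitter("guard")
guard.start_sound("Hello!", VOICE)
replaced = guard.start_sound("How are you!", VOICE)  # same emitter, same channel

thief = Emitter("thief")
thief.start_sound("Fine, thanks.", VOICE)  # other emitter, its own VOICE channel

barker = Emitter("barker")
barker.start_sound("bark one")  # barks take any free channel
barker.start_sound("bark two")

print(replaced)
print(len(guard.channels), len(thief.channels), len(barker.channels))
```

In this model the guard's second line replaces the first, the thief is unaffected, and the two barks coexist on one emitter.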


@stgatilov, sounds so promising! I see, in the bugtracker, you've added cvars:

Quote

Committed the change in svn rev 10284.
It adds three cvars for now:
  tdm_subtitles_inlineDurationExtension = 0.2 seconds: increase inline subtitles duration
  tdm_subtitles_inlineDurationMinimum = 1 second: inline subtitles duration must be at least this
  tdm_subtitles_durationExtensionLimit = 5 seconds: extension of sound duration due to subtitles is capped by this

The default value of 200 milliseconds is the same as the one hardcoded in conversation system.
So now in a conversation with "wait until finished" talk sounds, subtitles exactly follow each other.
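To see why the subtitles "exactly follow each other", here is a small Python sketch of the timing. It assumes each line starts when the previous clip ends plus the hardcoded 0.2-second action delay, and that each subtitle lasts max(T + 0.2, 1.0); the function names are invented for illustration.

```python
ACTION_DELAY = 0.2  # pause between conversation actions, per the thread


def subtitle_duration(clip, extension=0.2, minimum=1.0):
    return max(clip + extension, minimum)


def conversation_timeline(clips):
    """Return (start, subtitle_end) pairs for clips played back to back
    with 'wait until finished'."""
    timeline, start = [], 0.0
    for clip in clips:
        timeline.append((start, start + subtitle_duration(clip)))
        start += clip + ACTION_DELAY
    return timeline


timeline = conversation_timeline([2.0, 1.5, 0.5, 1.2])
for (start, end), (next_start, _) in zip(timeline, timeline[1:]):
    gap = next_start - end  # ~0 means back to back, negative means overlap
    print(f"subtitle {start:.1f}-{end:.1f} s, next line at {next_start:.1f} s, gap {gap:+.1f} s")
```

With the default 0.2-second extension the gap is zero for ordinary clips. For the 0.5-second clip the 1-second minimum kicks in, so its subtitle outlasts the start of the next line.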

You also said the conversation sounds go into the SND_CHANNEL_VOICE channel. So all parties to a conversation share the same channel, and thus what should be expected is that a single subtitle slot will show all parties, alternating. (This may or may not be better than having a separate slot for each party, but I understand that this is how it is.)

I guess from what we're saying here, the conversation system doesn't support duets.

It would be interesting to see if having a brief disappearance of the subtitle field between speakers would aid comprehension. This could be tested with, say, tdm_subtitles_inlineDurationExtension = 0.1. @datiswous, your opinion?

For conversations with a voice clip that needs a subtitle extension longer than 0.2 seconds (or that triggers the 1-second minimum duration), "wait until finished" by itself would still (and correctly) cause the next subtitle to truncate the current one. It would have to be "wait until finished" followed by an additional conversation "wait" of fixed duration.

BTW, for my testSubtitles program, I used the narrator voice (or maybe the player, I forget which), and, in spite of being from a common source, these sound clips can overlap/superimpose without a problem, presumably in different sound channels. And the subtitles appear in different slots.

So the concept of a "sound emitter" is a subtle one.


I wonder if it would be possible to alter the Conversation process, to make it more friendly to subtitlers and conversation authors? So currently (and abstractly):

"wait until finished" is clipDuration + betweenClipSpace,

where betweenClipSpace is hardcoded to 0.2 seconds and based on reasonable audio flow in a real-life conversation.

Suppose this was changed to:

"wait until finished" is MAX(clipDuration + betweenClipSpace, subtitleDuration + betweenSubtitleSpace)

where subtitleDuration is the clipDuration including any extension (due to 1-sec rule or specific extension ask)

betweenSubtitleSpace is a temporal visual gap between subtitles appearing in the same slot. Candidate values might be 0, 0.05, 0.1, 0.15. This could be ultimately hardcoded and apply to all FMs, but maybe should be a CVAR for initial evaluation.

I think this would be closer to what @datiswous had in mind.
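The proposed rule can be written out as a short Python sketch. This is only the proposal from this post, not current engine behavior. The names mirror the abstract terms above, and 0.1 is just one of the candidate values for betweenSubtitleSpace.

```python
def wait_until_finished(clip_duration, subtitle_duration,
                        between_clip_space=0.2, between_subtitle_space=0.1):
    """Proposed wait: long enough for both the audio gap and the subtitle gap."""
    return max(clip_duration + between_clip_space,
               subtitle_duration + between_subtitle_space)


# A 2.0 s clip whose subtitle is not extended: the audio term decides.
print(wait_until_finished(2.0, 2.0))
# A 0.5 s clip whose subtitle is held to the 1-second minimum: the subtitle term decides.
print(wait_until_finished(0.5, 1.0))
```

Only clips whose subtitles are extended well beyond the audio would lengthen the conversation under this rule.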


Related to this, sometime this week I'm going to -

- change my Excel worksheet (currently with the ongoing The Lord work) to incorporate the 1-second rule and also calculate the minimum time extension that should be requested (assuming that feature becomes available at some point).

- set up a separate local subtree for 2.12 dev builds. (This will be to receive dev TDM-installer releases. I'm not going the route of SVN pulls/builds.) Then I can do testing.

 


13 hours ago, Geep said:

You also said the conversation sounds go into the SND_CHANNEL_VOICE channel. So all parties to a conversation share the same channel, and thus what should be expected is that a single subtitle slot will show all parties, alternating. (This may or may not be better than having a separate slot for each party, but I understand that this is how it is.)

I guess from what we're saying here, the conversation system doesn't support duets.

You understood it incorrectly 😉

Every sound emitter has its own set of channels; channels are not shared between emitters.
So in a conversation, actor A has one VOICE channel, and actor B has another VOICE channel. They can talk simultaneously, but one actor A cannot say two sounds at once, and the same applies to subtitles (except that you can put two simultaneous subtitle lines into an .srt file, I think).

 

3 hours ago, Geep said:

I wonder if it would be possible to alter the Conversation process, to make it more friendly to subtitlers and conversation authors?

How would it help?
I'm rather reluctant to allow subtitles to change the behavior of existing mechanics, i.e. to extend the duration of conversations.


Anyway, I don't yet understand where the boundary between hardcoded and tweakable things should be.
I definitely prefer to keep things hardcoded as long as they work well enough, but if you think some tweaking is necessary, we can expose some values.

Moreover, I think story-level subtitle is given more priority than various barks.


P.S. Next I'd like to try to add positional cue, and some kind of debug text output to subtitles.


3 hours ago, stgatilov said:

Every sound emitter has its own set of channels, channels are different between emitters.
So in a conversation, actor A has one VOICE channel, and actor B has another VOICE channel. They can talk simultaneously, but one actor A cannot say two sounds at once, and the same applies to subtitles (except that you can put two simultaneous subtitle lines into .srt file I think).

Ah, OK. That changes things, probably removes the case for changing the conversation's "wait until finished" behavior.

Interesting that the player/narrator sound emitter doesn't seem to have the restriction that "one actor... cannot say two sounds at once".

Yes, .srt should handle 2 lines (but I haven't done anything with .srt myself yet. Barks aren't generally long enough for that. For inline, you just embed a "\n" to get 2 lines.)


On 2/27/2023 at 1:48 PM, stgatilov said:

P.S. Next I'd like to try to add positional cue, and some kind of debug text output to subtitles.

Also, please keep on your radar changing the parsing for the "inline" command, so that it can take an additional optional parameter, which I'm informally calling "extends to". This is the total time, in fractional seconds, to show the subtitle.

(This is one way to define the parameter; another would be as the ADDITIONAL time, beyond the clip duration.)

For The Lord subtitles (currently about to start my final QA pass), I've already generated Extends To values, but marked them as "// TO DO: " comments so as not to break current parsing.

Probably you'll see those subtitles next week - since IRL the kitchen is being boxed up in preparation for remodeling.

 


9 minutes ago, Geep said:

Also, please keep on your radar changing the parsing for the "inline" command, so that it can take an additional optional parameter, I'm informally calling "extends to". This is the time in fractional seconds to show the subtitle.

As a general question regarding configurability: are you really sure this is needed?

I have added hardcoded global extension and minimum time. If you don't like these values, we can choose better values.
Since the amount of space for one subtitle is very limited, all subtitles have more or less the same text size, and hence they take about the same time to read. So why do you think some particular sound should be extended by 3 seconds while most of the others are OK with 0.2 seconds?


Considering only barks, not conversations...

For most barks where the audio is more than 1 second, subtitle display extension is neither needed nor desirable. When it's reasonable to limit the subtitle duration to just the audio duration, that's what the captioning community seems to recommend.

However, if a comparison of clip length to reading time (estimated by character and/or word count) shows that there's not enough time to read the caption (given a maximum reading rate chosen as policy), then the solutions are  -

  1. shorten the subtitles (i.e., make it non-verbatim)
  2. slightly extend the subtitle display time

So, in order to minimize the need for (1), the ability to do (2) is desirable. The amount of extension should be done on a per-subtitle basis, not by a global parameter. Hence the optional parameter.

 


For The Lord's barks, out of 391 sound clips, 44 (so 11.3%) would need a time extension in order to still allow a verbatim (i.e., unshortened) subtitle at a quite-fast reading rate of 20 CPS or 240 WPM. The maximum time extension I am using is 1/2 second beyond the end of the clip. That is twice as long as what is considered best practice, but it permits all of those to be rendered verbatim, except for one that had to be shortened even with a 1/2 second extension. I mark this with the comment "// Shortened" in the .subs file.
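The decision described here can be sketched in Python. This is an assumed reconstruction rather than the actual spreadsheet logic: it uses only the 20 CPS rate and the 1/2-second cap mentioned above, counts every character including spaces, and the function names and sample clips are hypothetical.

```python
def required_extension(text, clip_duration, max_cps=20.0):
    """Extra display time needed to read `text` at `max_cps` characters/second."""
    reading_time = len(text) / max_cps
    return max(0.0, reading_time - clip_duration)


def classify(text, clip_duration, max_cps=20.0, max_extension=0.5):
    """Return 'fits', 'extend' or 'shorten' for one bark subtitle."""
    needed = required_extension(text, clip_duration, max_cps)
    if needed == 0.0:
        return "fits"
    if needed <= max_extension:
        return "extend"
    return "shorten"


# Hypothetical clips: (subtitle text, audio duration in seconds)
samples = [
    ("Who's there?", 1.1),
    ("I know you're in here somewhere!", 1.3),
    ("Come out now and I promise I'll make it quick, you thieving rat!", 2.0),
]
for text, duration in samples:
    print(classify(text, duration), round(required_extension(text, duration), 2))
```

A WPM check could be added in the same way; it is left out here to keep the sketch short.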


No later than Tuesday, I'll be releasing the following -

- A spreadsheet with all The Lord subtitles. The spreadsheet calculates metrics that help decide which subtitles benefit from extension and what those extensions individually should be. Plenty of examples there.

- A companion Word doc explaining the spreadsheet and the workflow generally. The spreadsheet is relatively complicated and so benefits from a column-by-column explanation.

- The testSubtitlesLord FM with the resulting polished subtitles embedded in it. This FM is slightly different than its Thug predecessor; the Word doc has the differences.

To come in March/April -

- My personal Style Guide for these barks

- A cleaned-up C++ program to gather sound clip durations to import into the spreadsheet.  (Current version works but is a bit too smelly to release.)

- Possibly the spreadsheet in template form

- And another AI character's utterances.


I guess I should just implement the extension for inline subtitles, since now it is just a matter of convenience. SRT lets you tweak the extension anyway.

Would it be good to make the duration extension a parsing state, like "story" / "speech"?
If you write "inlineDurationExtend 0.5", then the durations of all inline subtitles below it are extended by 0.5 seconds. If you don't set anything, the default of 0.2 applies.

I'm not very fond of this stateful approach, but it is already used for verbosity, and I believe the main idea was to avoid copy/pasting too much stuff (especially when a new parameter is added).
I hope that if the decls are kept small enough, then keeping track of all the state variables won't become too hard and error-prone.
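To make the stateful idea concrete, here is a toy Python parser. It is a sketch under simplifying assumptions, not the real decl parser: it handles only three one-line commands (verbosity, the proposed inlineDurationExtend, and inline with a sound name and text), ignores comments and the surrounding decl syntax, and the sound paths are made up.

```python
import shlex

DEFAULT_EXTENSION = 0.2


def parse_subtitle_lines(lines):
    """Parse simplified subtitle commands, carrying state from line to line."""
    verbosity = None
    extension = DEFAULT_EXTENSION
    entries = []
    for raw in lines:
        tokens = shlex.split(raw)
        if not tokens:
            continue  # skip blank lines
        command = tokens[0]
        if command == "verbosity":
            verbosity = tokens[1]
        elif command == "inlineDurationExtend":
            extension = float(tokens[1])
        elif command == "inline":
            sound, text = tokens[1], tokens[2]
            entries.append({"sound": sound, "text": text,
                            "verbosity": verbosity, "extension": extension})
        else:
            raise ValueError(f"unknown command: {command}")
    return entries


entries = parse_subtitle_lines([
    'verbosity speech',
    'inline "sound/a.ogg" "Hello!"',
    'inlineDurationExtend 0.5',
    'inline "sound/b.ogg" "How are you!"',
])
for entry in entries:
    print(entry["sound"], entry["verbosity"], entry["extension"])
```

The first subtitle keeps the default 0.2, while everything after the inlineDurationExtend line gets 0.5 until the state is changed again. That is exactly the property that makes entry order matter.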


On 3/3/2023 at 3:53 PM, stgatilov said:

I guess I should just implement the extension for inline subtitles, since now it is just a matter of convenience

I think we still differ on what that implementation should be. Sorry, I'm going to be a pain in the butt.

Quote

SRT lets you tweak the extension anyway.

Not really true; currently, the last phrase of an SRT is still bound by the original clip duration. So, SRT would also need the ability to be extended, and the SRT author would need to be able to make use of that extension when specifying the last-phrase end time.

That said, I'm mainly interested in inline at this time, so, in the interest of getting that done, willing to postpone consideration of SRT extensions (say until 2.13). On the third hand, SRT will probably be used a lot more with "story" phrases, including those appearing in Conversations. So maybe it can't be postponed.

For conversations (including those that are inline), I think the default extension should be 0, not 0.2 seconds. Because it's not desirable to extend unless you really have to, as I argued earlier.

Quote

Would it be good to make duration extension a parsing state like "story" / "speech" ?

No. What I want is duration extensions that can be individualized and applied on a per-sound-clip basis. I would much prefer just specifying another individual (but optional) parameter, even if that requires more cut/paste. If instead it's necessary to reorder and group sound clips to apply extensions, this will make management of my testing list considerably more difficult.

ALTERNATIVE: I could imagine, instead of specifying an optional extension parameter, it could be calculated on-the-fly. The algorithm would count the number of characters and the approximate number of words, apply them against global max CPS and WPM settings, and generate extensions, bounded by 0.0 and 0.5 seconds by default. If you're interested in this alternative, we could pursue it further. (The CPS/WPM could be one or two user-adjustable CVars.)
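A minimal Python sketch of this on-the-fly alternative follows. The details are assumptions: words are counted by splitting on whitespace, the slower of the two reading-time estimates wins, and the 20 CPS / 240 WPM defaults are the rates I mentioned earlier for The Lord, not agreed settings.

```python
def auto_extension(text, clip_duration, max_cps=20.0, max_wpm=240.0,
                   min_extension=0.0, max_extension=0.5):
    """Estimate a per-subtitle extension from character and word counts."""
    chars = len(text)
    words = len(text.split())  # approximate word count
    reading_time = max(chars / max_cps, words * 60.0 / max_wpm)
    needed = reading_time - clip_duration
    return min(max_extension, max(min_extension, needed))


print(auto_extension("Who goes there?", 0.5))        # a modest extension
print(auto_extension("Hm.", 1.0))                    # plenty of time already
print(auto_extension("I swear I heard something moving over there!", 0.8))  # hits the cap
```

Anything that hits the upper bound would still need manual attention, since the cap means the subtitle cannot be read verbatim at the chosen rate.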


On 3/6/2023 at 6:52 AM, datiswous said:

Why is subtitles not an integrated part of the hud gui code (tdm_hud.gui)?

Can only speculate. The HUD features were in TDM a long time (so mature) before they became resizable. Subtitles are still kind of a work in progress. Also, subtitles are optional, whereas the other HUD features are (with an exception or two) not. And maybe tdm_hud.gui is already complicated enough.


On 3/6/2023 at 12:52 PM, datiswous said:

Why is subtitles not an integrated part of the hud gui code (tdm_hud.gui)?

Subtitles work during main menu cutscenes (video / briefing), where no HUD is available, that's why they have to exist as a separate entity.

Another reason why subtitles cannot be included in tdm_hud.gui is that they must be displayed during in-game cutscenes and in any other cases when the HUD is hidden.
The game code maintains a set of overlays; every overlay is a separate UI with its own root "Desktop" window. If in-game subtitles became part of the HUD, then we would have to invent some custom code to keep the HUD visible while hiding almost all of its parts...


The HUD and subtitle guis could be made aware of each others' layout via existing and new CVars. So if you wanted to, say, have subtitle field widths shrink as inventory icon size enlarges, you could. Not saying that's a good idea, just that it's possible.


(Following is copied from "AI Barks" thread, regarding:

- what the subtitle field margins should be

- what the character limits should be

- making the subtitle fontsize and field size adjustable.)

@datiswous, the centering of the subtitles was the stock tdm_subtitles_common default, so I left it that way with my The Lord customization. I'm not sure centering (versus, say, left justification) is really the issue with respect to subtitles fitting in the box.

To that point, as a demonstration with testSubtitlesLord, I used rather aggressive side margins of 170 (so a field width of 300), which can just-barely accommodate 42 characters/line. But that means, depending on word breaks, some lines with 42 or 41 characters will force an autobreak, as it sounds like you observed.

While a margin of 170 is the minimum required to avoid overlap with fully-enlarged inventory icons, I think such overlap is less critical than avoiding unwanted word-wrap. So margins of 160 and a field width of 320 are the better choice going forward, while still avoiding the egregious overlap of the stock tdm_subtitles_common.
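The margin arithmetic can be checked with a few lines of Python. The 640-unit virtual screen width is inferred from the numbers above (2 x 170 + 300 = 640). The width-per-character figure is only an average derived from "300 just barely fits 42 characters", and real lines will vary with a proportional font.

```python
SCREEN_WIDTH = 640  # virtual GUI width implied by margin 170 -> field width 300


def field_width(side_margin, screen_width=SCREEN_WIDTH):
    """Width left for the subtitle field with equal left and right margins."""
    return screen_width - 2 * side_margin


avg_char_width = field_width(170) / 42  # rough average, from the 42-char observation

for margin in (170, 160):
    width = field_width(margin)
    print(f"margin {margin}: field width {width}, "
          f"about {width / avg_char_width:.1f} average-width characters per line")
```

By this rough estimate, widening the field to 320 leaves a couple of characters of slack beyond 42, which is what should reduce the unwanted auto-breaks.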

Can you live with a character-length restriction of 42 characters/line? I think (given 2 lines) that should be adequate for almost all srt phrases that are limited to about 6 seconds.

Another approach to strike a balance between margin overlap and word-wrap would be to reduce the font scale slightly.

And in the long run, it would probably be desirable to have a CVar that the user could use to scale both font size and corresponding field height & width (and probably interfield vertical spacing). This could be shown as part of the HUD Settings special screen, even if internally it is a separate layer. Then the user could find their own compromise.


@stgatilov, another problem area is the complete overlap between the inventory pickup message field (just above the breath bar) and the subtitle fields, particularly the lowest. I think when both a subtitle and a pickup message appear, both will be very hard to read.

I can think of many solutions. Here's two of the easier ones:

  1. when subtitles are on, pickup messages are suppressed, i.e., as if you had set cvar tdm_inv_hud_pickupmessages "0".
  2. do away with the pickup message field, and just show pickup messages in the objectives/saved-games message field at top.

A variation on (1) would be, when subtitles are on, show pickup messages just as a subtitle with "story" priority e.g.:

   [acquired 80 in jewels]

A variation on (2) would be, keep the pickup message field, but reroute its messages to the objectives message field when subtitles are on.

(Hmmm, I forget where trainer messages appear. Is that also a problem?)
