Subtitles - Possibilities Beyond 2.11

nbohr1more · February 12

56 minutes ago, datiswous said:

Do subtitles for voice audio in non-video briefings actually work?

I have an instance where I can't get them to show up. Or I'm missing something.

Both "A New Job" and "Tears of St Lucia" use non-video briefings and have working subtitles. Perhaps one of the GUI defs in the mission is overriding something that subtitles require.

stgatilov · February 12

6 hours ago, datiswous said:

Do subtitles for voice audio in non-video briefings actually work?

I have an instance where I can't get them to show up. Or I'm missing something.

Since you have completely redefined the briefing, you need to add the subtitles GUI for them to show up.
See stock mainmenu_briefing.gui.

datiswous · February 12

I added this to the included mainmenu_briefing.gui :

	//stgatilov #2454: display subtitles
	#define SUBTITLES_NAMEPREFIX Briefing
	#include "guis/tdm_subtitles_common.gui"

I thought that would do it..

Btw, this is for mission nhat. This section doesn't really need subtitles, because they are already burned into the images, but I still want to figure out why they don't show up.

Edited February 12 by datiswous

Geep · February 12

On 2/10/2024 at 2:57 PM, stgatilov said:

Some GUIs use some kind of pseudographics.
Reducing the spacing also breaks such readables.

Is this really a problem? Maybe we can figure that out before reverting the "J", which previously was so badly spaced it looked like it had a space character after it.

It would be helpful to identify which missions have such pseudographics. I suspect it's a small number. And those using Stone font smaller still.

In that regard, I did some preliminary experiments, limited to a cohort of 46 FMs on my machine with expanded FM .pk4s, across which I can do full-text searches with TextPad. (@stgatilov, I imagine you are in a position to similarly and more conclusively search the entire English FM collection.)

I searched with characters (or short strings) that would be likely to be found in pseudographics. For untranslated FMs, I searched in filenames of form *.xd . For translated FMs (those that had string "#str_" in their *.xd; there were 8 such FMs in my cohort) I searched in file english.lang
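For anyone without TextPad, the same kind of search could be scripted. This is only a minimal Python sketch of the idea, not the procedure actually used above: the function name `find_candidates` and its defaults are hypothetical, and it simply scans both *.xd files and english.lang rather than first checking each FM for "#str_".

```python
from pathlib import Path

def find_candidates(root, needles=("|", "\\\\"), patterns=("*.xd", "english.lang")):
    """Yield (path, line_number, line) for every line containing any needle.

    root     -- folder holding the expanded FMs (assumed layout)
    needles  -- literal strings to look for; the defaults are "|" and an escaped backslash
    patterns -- file name patterns to scan, searched recursively
    """
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            # latin-1 never fails to decode, which suits 8-bit codepage text
            text = path.read_text(encoding="latin-1")
            for number, line in enumerate(text.splitlines(), start=1):
                if any(needle in line for needle in needles):
                    yield path, number, line.rstrip()
```

Each hit still needs inspection by eye; the script only narrows down where to look.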

Results

Many search strings were useless due to too many false hits:

"_" Widely used as part of internal xd name. Used on separate line as heading underscore.
"/" Widely used as part of internal xd name.
"\" (except "\\") Widely used as the \n or \" escape character
"--" (assuming single dash is everywhere) Several in a row to bracket a heading, or as part of a letter's signature. Used on separate line, as heading underscore or as border between paragraphs

These search strings were moderately useful, showing up in just a few FMs (tho none were pseudographics on examination):

"=" Used on separate line as heading underscore (in penny3's erasing.xd)
"+" Used as bullet points, or (several in a row) to bracket a heading (in penny3's erasing.xd; in bcd's english.lang)

These search strings were most promising:

"|" Only 1 hit, but it's a table-form pseudographic! (northdale1's snowedinn.xd)
"\\" (i.e., escaped \) No hits found in my cohort

Inspection of snowedinn.xd showed this:

Spoiler

readables/snowedinn/candle_shop_ledger
{
   precache
   "num_pages"   : "1"
   "page1_left_title"   :
   {
       "               Sales Ledger"
       "************************"
   }
   "page1_left_body"   :
   {
       ""
       ""
       "         Item         |Gold |qty"
       "-------------------------"
       "Skull candle    | 15g    | 3"
       "Silver lamp     |20g    | 2"
       "Small candle | 8g      | 4 "
       "                             |            |"
       "                             |            |"
       "                             |            |"
       "                             |            |"
       "                             |            |"
       "                             |            |"
       "                             |            |"
       "                             |            |"
   }
   "page1_right_title"   :
   {
       ""
   }
   "page1_right_body"   :
   {
       ""
   }
   "gui_page1"   : "guis/readables/books/book_calig_mac_humaine.gui"
   "snd_page_turn"   : "readable_page_turn"
}

Note that it uses book_calig_mac_humaine.gui; so this pseudographic was not in Stone font

datiswous · February 12

13 hours ago, datiswous said:

I thought that would do it..

Ok, so it turned out to be just a stupid mistake in my fm_root.subs file. All is fine.

stgatilov · February 13

When I actually meant pseudographics, I did not mean some specific line/dots/crosses.
I simply meant the cases when people add spaces until the text gets the exact alignment they want.

You can't search for such cases, because the only offending character is space, which is present everywhere.
The only way to proceed is to just change the spacing and listen for feedback afterwards.
And that sounds like an OK solution to me in general, but definitely not a good idea for 2.12.

Geep · February 14

OK, I understand. I will revert the spacing changes. But I'm looking at some other issues in that DAT file, so it won't happen until around next week.

snatcher · February 14

23 hours ago, stgatilov said:

You can't search for such cases, because the only offending character is space, which is present everywhere.

I don't know what the target format looks like, but we could use a regular expression, I guess?

Quote followed by one or more spaces..

Quote followed by something that isn't a space followed by more than one space...
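Those two patterns might look something like this in Python. It is a rough sketch that assumes the xd layout shown in the snowedinn example above (one double-quoted string per line); the names `LEADING_SPACES`, `INNER_RUN` and `looks_space_aligned` are made up for illustration.

```python
import re

# Pattern 1: opening quote immediately followed by one or more spaces, then text.
LEADING_SPACES = re.compile(r'^\s*" +\S')

# Pattern 2: opening quote, non-space text, then a run of two or more spaces
# before more text, all inside the same quoted string.
INNER_RUN = re.compile(r'^\s*"[^ "][^"]*? {2,}\S')

def looks_space_aligned(line):
    """Return True if the line matches either space-padding pattern."""
    return bool(LEADING_SPACES.search(line) or INNER_RUN.search(line))
```

It will also flag harmless cases, such as two spaces typed after a full stop, so the hits would still need a manual look.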

Geep · February 16

It's hard. I see what you are thinking about is a "bullet point" pattern, with or without a bullet point character. For that case, I suppose what we are trying to detect is:

- the line starts with the bullet point pattern

- a "J" character (say) with a shortened spacing, is somewhere in the line.

- the line itself in the xd file is long enough to force a word wrap (let's just think about a single wrapping here)

- the author assumed a particular word wrap point, and put in extra spaces in mid-line, immediately at the wrap point, so as to indent the latter part of the line, aligning with the indent of the first part.

- the shortened-spacing J screws up the resulting alignment.

Probably would need a script/program to winnow this down... ideally, that algorithm could also restrict attention to just Stone font. I'm OK with putting such an effort off for now, maybe revisiting it for 2.13.

Geep · February 21

Here is another update to the English Stone font's DAT file used for subtitles and some readables:

fontImage_24.dat

This supersedes the Jan 30th update. As agreed, now all character spacings - as given by xSkip - are preserved (from this file in TDM 2.11). Exception: the Jan 30th repair of garbage metadata for "<" and ">" remains, including xSkip repair.

ASCII Characters (lower 128). The earlier Jan 30th post summarizes those ASCII characters that needed metadata changes to avoid adjoining stray marks. (The detail report below updates newer reversions and minor revisions.)

ANSI Characters (upper 128). An analysis was also made of the status of the Stone 24 pt font's characters in the upper codepoint range of 128-255. This could be of interest if the subtitle system were some day expanded to include European languages, and continued to use the historic codepage method. The analysis also prompted some additional DAT tweaks now.

Broadly, implementation of the upper-range characters (standard or TDM-specific, as defined in the TDM wiki's I18N - Character mapping) is incomplete for Stone 24 pt. The status is:

(1) 43% (55 chars) Good as is.
(2) 9% (12 chars) Good enough after DAT tweak included in this update.
(3) 6% (7 chars) Missing and shown as hollow box.
(4) 30% (38 chars) Missing accent/diacritic.
(5) 7% (9 chars) Otherwise weird. But often suggestive of glyph work started but not completed.

In addition, 7 chars within categories (3-5) were "improved", but are still not good. To really solve categories (3-5) requires DDS bitmap surgery (and corresponding DAT adjustments), which is beyond the scope of planned work.

DAT tweaks of ANSI characters (like with ASCII) were careful to avoid changes to xSkip. Tweaks can be further grouped by problem solved....

In category (2):
- (4 chars) Char is clipped, with stray mark from adjoining character on other side.
- (7 chars) Stray mark, without char clipping.

As improvements in categories (3-5):
- (3 chars) Stray mark, without char clipping
- (4 chars) Out of valid range on top edge

A categorized itemization about treatment of specific problems and characters, with further details, is here:

Spoiler

Changes to 2.11 english Stone fontImage_24.dat
Feb 21, 2024 by Geep

Within This Report
================
- TDM chars shown herein are portrayed using UTF-8, not ANSI.
- Changes to s, s2, t, t2 are described in pixel units. In terms of float values in range [0.0 .. 1.0], 1 pixel = 1.0/256.0

ASCII Fixes, also done earlier for Jan 30th DAT update
===============================================

Problem: Stray mark from adjoining character seen to left. Right side of character is not clipped.
Solution: move s to the right, incrementing 1 pixel, and compensate by decrementing pitch & imageWidth.

   char 37 (0x25) %
   char 67 (0x43) C
   char 69 (0x45) E
   char 84 (0x54) T
   char 87 (0x57) W
   char 89 (0x59) Y
   char 124 (0x7c) |   [Jan 30th change to xSkip is reverted][s move & imageWidth changed by 1 as described, but pitch decremented by 3 for better (though suboptimal) spacing]

Problem: Poor spacing of J
Prior Solution: Earlier, for the Jan 30th DAT update, both pitch and xSkip were altered. The J descender could go under the preceding character on the line. But there was then concern about xSkip changes affecting existing readables.
Current Solution: the xSkip change is reverted, and instead the pitch decrement is made larger (to 3). This is suboptimal, so maybe revisit the possibility of an xSkip change for TDM 2.13, or shift pixels in the bitmap. Since J is at the left edge of the 256x256 bitmap, this limits shifts in s & s2.
char 74 (0x4a) J

Problem: Garbage image box
Solution: Redo based on bitmap (including xSkip)

char 60 (0x3c) <
char 62 (0x3e) >

ANSI Fixes, done in this Feb update
===============================

Fully Fixed Chars
--------------------

Problem: Stray mark from adjoining character seen to left, and right side of character is clipped.
Solution: move both imagebox x coordinates s & s2 to the right, incrementing 3 pixels

Shifted right 3:
char 169 (0xa9) Ů

Problem: Stray mark from adjoining character seen to right, and left side of character is clipped.
Solution: move both imagebox x coordinates s & s2 to the left, decrementing 3 or 6 pixels

Shifted left 3:
char 131 (0x83) Ż
char 132 (0x84) Ź

Shifted left 6: [a shift of 7 would space better, but then show stray marks to left. Probably bitmap surgery if want better still.]
char 180 (0xb4) Ž [Also decremented pitch by 3, to benefit spacing]

Problem: Stray mark from adjoining character seen to left. Right side of character is not clipped.
Solution: move s to the right, incrementing 1 or 2 pixels, and compensate by decrementing pitch & imageWidth.

Move s 2:
char 154 (0x9a) ǔ

Move s 1:
   char 164 (0xa4) ű
   char 184 (0xb8) ž   [Also decrement pitch by 2, to benefit spacing]
   char 205 (0xcd) Í
   char 232 (0xe8) è
   char 233 (0xe9) é
   char 250 (0xfa) ú

Problem: Poor spacing of Æ
Solution: decrement pitch by 3 [Good enough, if still not optimal. Didn't try shifting s, s2.]
char 198 (0xc6) Æ

Improved Chars, but Still Not Good
-------------------------------------------

Problem & Solution: Same stray marks to left as just mentioned above.
Problem remaining: These characters are also listed below as missing their accent/diacritic.

Move s 1:
   char 172 (0xac) Č
   char 190 (0xbe) Ÿ
   char 203 (0xcb) Ë

Problem: Upper image boundary is negative, so out of range (t = -7). Glyphs that "ran off" bitmap had row 0 pixels replicated upward [this may vary with graphics driver], so were weird looking.
Partial, Temporary Solution: Set t to zero, and reduce "height", "imageHeight", & "top" by 7. So, these are now just clipped on top.

   char 140 (0x8c) Ń [Note A]
   char 152 (0x98) ô [Note B]
   char 153 (0x99) ŕ [Note B]
   char 240 (0xf0) ð [Note C]

   [Note A] After fix, this char is listed below as shown with hollow box.
   [Note B] After fix, this char is listed below as missing its accent.
   [Note C] After fix, this char is listed below as weird.

Problems due to Bitmap Content. No Repair Undertaken
==================================================

Glyphs with Missing Accent:

   char 133 (0x85) Ŝ
   char 134 (0x86) Ĉ
   char 135 (0x87) Ẑ
   char 136 (0x88) Ô
   char 137 (0x89) Ŕ
   char 138 (0x8a) Ǔ
   char 139 (0x8b) Ă
   char 151 (0x97) ẑ
   char 153 (0x99) ŕ     Clipped against top of bitmap
   char 155 (0x9b) ă
   char 162 (0xa2) Ű    (also standin for Hungarian Ü)
   char 165 (0xa5) Ě
   char 170 (0xaa) Ą
   char 171 (0xab) Ę
   char 172 (0xac) Č      [Note 2]
   char 176 (0xb0) Ő     (see also: Hungarian Ö at TDM codepoint 214)
   char 178 (0xb2) Ť
   char 179 (0xb3) Ď
   char 182 (0xb6) ť     [Note 3]
   char 183 (0xb7) ď
   char 190 (0xbe) Ÿ     [Note 2]
   char 192 (0xc0) À
   char 199 (0xc7) Ç
   char 200 (0xc8) È
   char 202 (0xca) Ê
   char 203 (0xcb) Ë     [Note 2]
   char 204 (0xcc) Ì
   char 206 (0xce) Î
   char 207 (0xcf) Ï
   char 208 (0xd0) Ð     Missing accent is horizontal bar
   char 209 (0xd1) Ñ
   char 210 (0xd2) Ò
   char 212 (0xd4) Ô
   char 213 (0xd5) Õ
   char 217 (0xd9) Ù
   char 219 (0xdb) Û
   char 221 (0xdd) Ý
   char 255 (0xff) ÿ     [Note 3]

[Note 2] This char had a stray mark to its left from an adjoining character. That problem was fixed above, but the missing accent was not addressed.

[Note 3] This applies to TDM European but not TDM Russian codepoints, where FF works with B6 to show я.

Glyphs with Other Problems. Charitably, Work Started but Not Finished:

   char 152 (0x98) ô     Circumflex badly rendered as small square
   char 167 (0xa7) §     Wrong glyph, shows 8, probably as starting point
   char 168 (0xa8) š     Adequate, though accent only so-so
   char 188 (0xbc) Œ    Wrong character, shows D. Maybe plan to flip it?
   char 189 (0xbd) œ    Wrong character, shows o (start of work?)
   char 191 (0xbf) ¿    Shows ? instead. Maybe plan to flip it?
   char 222 (0xde) Þ     Wrong char B (start of work?)
   char 240 (0xf0) ð     Wrong character, o (prob start of work). Clipped against top of bitmap.
   char 254 (0xfe) þ     Wrong character B (maybe starting glyph to convert)

No Glyph. Shown as Hollow Box:

   char 140 (0x8c) Ń    Clipped against top of bitmap
   char 141 (0x8d) Ș
   char 142 (0x8e) Ț
   char 144 (0x90) đ
   char 157 (0x9d) ș
   char 158 (0x9e) ț
   char 173 (0xad) Soft hyphen (SHY). Shown as hollow box. Better to show as regular hyphen?

Information about methods, including new tools, to conduct this analysis and tweaking will be forthcoming, mostly after the 2.12 release.
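To make the arithmetic behind the two most common DAT tweaks in the report concrete, here is a small Python sketch. It is not the tool used for the actual edits. It models a glyph as a plain dict keyed by the field names from the report (s, s2, pitch, imageWidth, xSkip), and it assumes that s and s2 are floats in [0.0 .. 1.0] while pitch and imageWidth are whole pixels. The function names and the sample values are hypothetical.

```python
PIXEL = 1.0 / 256.0  # per the report: 1 pixel = 1.0/256.0 on the 256x256 bitmap

def shift_s_right(glyph, pixels=1):
    """Stray mark on the left, right side not clipped:
    move s right and narrow the box to compensate. xSkip is left alone."""
    fixed = dict(glyph)              # work on a copy, keep the original intact
    fixed["s"] += pixels * PIXEL
    fixed["pitch"] -= pixels
    fixed["imageWidth"] -= pixels
    return fixed

def shift_box(glyph, pixels):
    """Stray mark on one side and clipping on the other:
    move both s and s2 by the same amount (negative = left). Width is unchanged."""
    fixed = dict(glyph)
    fixed["s"] += pixels * PIXEL
    fixed["s2"] += pixels * PIXEL
    return fixed

# Made-up sample values, for illustration only.
glyph = {"s": 10 * PIXEL, "s2": 20 * PIXEL, "pitch": 10, "imageWidth": 10, "xSkip": 11}
print(shift_s_right(glyph))   # s moves right by 1 pixel, pitch and imageWidth drop by 1
print(shift_box(glyph, -3))   # both edges move 3 pixels left
```

The extra pitch decrements noted for individual characters (J, |, Ž, ž, Æ) would be applied on top of these, case by case.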

Geep · February 24

Oops, in that last release, I accidentally didn't check character spacing in the codepoint range 128-159. And sure enough, there were problems with characters 131 & 132, which needed 4 more pixels of horizontal box shifting than the 3 pixels they got earlier. So here's a better version:

fontimage_24.dat of Feb 24

Geep · February 25

Here is the release of testSubtitlesANSI , one of my tools to test and improve the Stone 24 pt font that is used in all the English subtitles.

testSubtitlesANSI.pk4 of Feb 25, 2024

As the screenshot below reveals, this testing tool is an FM that's a variant of that used to develop subtitles for individual AIs. Here, the subtitles are just alphabetic lists. There are 7 subtitles that cover all the TDM 8-bit codepoints (i.e., 0-255), including those representing European languages. The first two subtitles (i.e., codepoint sub-ranges) are shown in this screenshot. The tool allows inspection for stray marks and inter-character spacing; black walls & floor facilitate that. This release includes the most recent improvement to these attributes, provided in yesterday's update of the fontimage_24.dat file. (The changing of those attributes was done with a different tool, used iteratively with this one.)

snatcher · February 25

Uppercase C, E, T, W, Y are way better now. See if you can review capital G.

Geep · February 26

16 hours ago, snatcher said:

Uppercase C, E, T, W, Y are way better now. See if you can review capital G.

I just looked at capital G. I didn't see any stray marks. The spacing to its right was a bit larger than required, but I didn't think grossly so. Is that what you were referring to?

snatcher · February 26

Don't ask me why, but artifacts are more evident when using my spin-off of the gui file.

I noticed C, E, T, W, Y improved in your latest fontimage_24.dat and I wondered if you overlooked G.

Geep · February 27

OK, I poked around a little further. At the moment, beta5 is what I have installed.

testSubtitlesANSI ships with an override of tdm_subtitles_message.gui, which allows easy adjustment of textscale.

The shipped textscale is set to 0.24. With that, the stray marks left of G are not visible. If you change it to 0.25, they are visible; which is evidently what your custom gui is also using. If you set aside any custom tdm_subtitles_message.gui, then the core one goes into effect. Which, for beta 5, uses a textscale of 0.24.

I don't know whether we will be going with 0.24 or 0.25 for the release. Nevertheless, now that I can see the stray marks, I'll see about fixing them.

BTW, with your gui, the black outline shadows have the effect of enlarging the stray marks as black smudges. Helpful for diagnosis.

Geep · February 27

Yet another Stone 24 pt Update.

This removes stray marks to the left of G (as well as char 249, u with accent grave), visible with textscale 0.25 but not 0.24. So this may or may not benefit subtitles, but it will definitely help Stone font readables. Most of them use a scale of 0.25 for body text.

An attempt was made to also improve G spacing, but (since xSkip can't be changed) effect is marginal at best.

fontImage_24.dat of Feb 27

snatcher · February 27

2 hours ago, Geep said:

The shipped textscale is set to 0.24. With that, the stray marks left of G are not visible. If you change it to 0.25, they are visible; which is evidently what your custom gui is also using.

This explains why we were seeing different things.

2 hours ago, Geep said:

[...] core one goes into effect. Which, for beta 5, uses a textscale of 0.24

And for beta 7.

2 hours ago, Geep said:

BTW, with your gui, the black outline shadows have the effect of enlarging the stray marks as black smudges. Helpful for diagnosis.

Yeah, more power

50 minutes ago, Geep said:

Yet another Stone 24 pt Update.

Well done. The most common characters are pretty clean now regardless of the textscale.

51 minutes ago, Geep said:

Yet another Stone 24 pt Update.

[...] but it will definitely help Stone font readables. Most of them use a scale of 0.25 for body text.

I don't recall noticing artifacts in readables but then the text normally is dark and stray marks most likely blend in with the irregular backgrounds.

1 hour ago, Geep said:

An attempt was made to also improve G spacing, but (since xSkip can't be changed) effect is marginal at best.

The only letter I could detect where spacing is noticeable is: J ack J erk J iffy J oy J uice.

Geep · February 27

2 hours ago, snatcher said:

The only letter I could detect where spacing is noticeable is: J ack J erk J iffy J oy J uice.

Yeah, it's J anky, but can't be fixed in the DAT file. Likely it requires moving pixels in the bitmap. I'm personally unenthused about doing that, particularly in a 2.12 timeframe.

Geep · March 2

I have just released a new utility, "refont", as an open-source, partial-successor of the traditional "q3font". For the whole story, see the new wiki page Refont.
