Jump to content
The Dark Mod Forums

Geep

Contributor
  • Posts

    1185
  • Joined

  • Last visited

  • Days Won

    61

Posts posted by Geep

  1. Intro to Air Pocket’s Readables

    Air Pocket has just two readables, with shared and duplicated pages; there are really only two unique pages. One is titled “Appointment to Service”, and represents printed, rather draconian boilerplate given by the ship captain to every hired seaman. The other, titled “Further instructions”, is handwritten and specific to our hero and his lady. This appears as a readable by itself, but is also collated (as page 2) in a bundle and nailed up with two copies (pages 1 and 3) of the “Appointment to Service”, for Thief and Emily as new hires. These immobile readables appear at the outset of the FM, which makes iterative localization testing easier. They merely set the scene and have no game hints, so no worries about spoilers here.

    Because these are not mobile readables, there is no need for inventory names with associated #str_ . (If they were mobile, the translations of the titles could serve as inventory names.)

    In this post, I’ll discuss the AI translation, saving the largely (but not entirely) post-translation text-expansion issues for the next post.

    Prompt Engineering for Readables

    When feeding to AI, the two pages can reasonably share a prompt, but otherwise be treated independently; there is no compelling reason to prompt them as story flow.

    I am skipping asking ChatGPT to do the back-translation, because I don’t trust it to not cheat. Just put a tag at the end of each line, “//uv”, for unverified. Perhaps I’ll ask another AI to back-translate later.

    A draft reusable template follows. It describes the overall translation task, the desired tone, and input and output formats. Text specific to readables is shown within the next section in bold.  Text that is specific to this FM, to clarify the context and the meaning of particular words, is in italics.

    Input to ChatGPT

    You are an expert translator between English and other European languages, including Russian. You wish to translate a list of pages from English to these other languages. In the list, each page is shown by two consecutive lines: the title text of a page followed by the body text. Each line of the list begins with a tab, then a word beginning with #str_ within double quotes, then another tab, then a phrase within double quotes. When you translate a line, in the output keep the tabs and the #str_ word unchanged, and only change the phrase to the other target language in UTF-8, keeping it in double quotes. Avoid generating “Unicode combining characters”. Try to keep the character count (including spaces and punctuation) in each translated phrase not too much longer than the original English, while preserving the formal meaning. Specifically, for title text, try to keep character count under 50% longer, and use a title style of word casing. For body text, try to keep the character count under 30% longer. Before, between, or after sentences, one or more “\n” may appear, indicating a line break; retain these in the translation. Avoid modern slang. Old-fashioned and historic nautical wording is fine. At the end of each line, add another tab followed by the fixed text "//uv", indicating unverified.

    There are two pages. They relate to a fictional voyage, several centuries ago, of a small sailing merchant ship Esmeralda.

    The first page is a printed boilerplate notice, given by Esmeralda’s Captain Riggs to each seaman hired for a voyage. It is effectively a contract, written with lots of words in the body capitalized to make it seem more official and legal. Parts of the text are designed to be somewhat threatening, as if this hiring was aboard a big government naval vessel, not just a merchant ship.

    The second page is an informal handwritten note, also by Captain Riggs, with more specific instructions to two hired crew members.

    On all pages, leave “Riggs” and “Esmerelda” unchanged during translation.

    Following the input list of pages, append output lists in these target languages: 1. German 2. French 3. Polish 4. Italian 5. Spanish 6. Portuguese 7. Russian 8. Czech 9. Hungarian 10. Dutch 11. Slovak 12. Danish 13. Swedish 14. Romanian 15. Turkish 16. Catalan.

    List of pages:
        "#str_fm_airpocket_xd_sheet_appointment_to_service_pg1__title"    "Appointment to Service\n"
        "#str_fm_airpocket_xd_sheet_appointment_to_service_pg1_body"    "\n\nHEAR YE ALL, that the Bearer is found Fit to Serve, and appointed to the Rank of Ordinary Seaman, aboard the Merchant Ship Esmerelda, by Authority of the Shipmaster, Captain Riggs.\n\nThe Appointee serves Under the Direction and at the Pleasure of the Captain and Officers. By the Law of the Sea known by all, while Aboard, Violations of Duty will be Judged & Punished by the Captain and Those carrying out his Orders. Thievery, Mutiny, Assault, and Other Gross Breeches of Order will be met by Flogging or Death. At Successful Voyage End, the Appointee upon Discharge will receive the Agreed Wage recorded in the Ship’s Log, less any Deductions.\n"
        "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2__title"    "Further Instructions\n"
        "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2_body"    "\n\n\n\nAs agreed, the 2 new crewmates will bunk in the galley, and share the sea trunk nearest the door in the mess.\n\nCaptain Riggs\n"

    Output

    ChatGPT broke up the results into frames by language again, but I just noticed that, beyond the end of the output, there is an overall “copy” button. So I don’t really have to copy each language individually. But still in either case, I’m post-copy editing to put separating headers in the proper form, e.g. [German].

    The first time I did this, in late evening, it did the whole gulp very fast. The second time, with a more-complete prompt (shown above), at noon time, the system was evidently being hammered (or maybe just a policy slowdown). It typed out slowly and paused several time, asking me to hit a Continue Generating button. Frames where those pauses occurred got messed up in the overall copied output, so I had to recopy those individual frames, which were OK.

    Also, ChatGPT applied a different text color highlighting this time. Doesn’t matter.

    At first glance (and without back-translation verification yet), results looked pretty plausible. Interesting that where the text mentions "Captain Riggs", for most languages the "Captain" was left in English.

    Next, I’m going to add the new preliminary readables translations to all.lang and generate the *.lang, for the purpose of estimating and testing text scaling.

     

     

    • Like 1
  2. A Plan for Readables with AI Translations

    I’ve created a separate forum thread, "Proposal - Fade-In Translations for Readables", for a new style of readable more adaptable to translations. This would require engine changes, but can be faked in the short term. I plan to create custom .gui to do this for Air Pocket. And if the proposal wins favor and moves to timely implementation, to revise Air Pocket to use the new system.

    I'ill be pausing my posts here for a moment to build out that other thread, then come back to explore the actual AI translations for Air Pocket readables.

    • Like 1
  3. Testing Air Pocket with Just English Strings

    Why? So far, English is the only language with all #str_ done; the rest have only inventory items and shouldered names done. Preliminary testing of the just the English version is mainly because of all the manual renaming. If #str_ were generated automatically with better naming, this could be largely skipped.

    How? The [English] part of our emerging all.lang was copied out and became the heart of a replacement strings/english.lang. I didn’t have to bother with gen_lang processing. This works because English is restricted here to ASCII and that’s common to both all.lang [UTF-8} and english.lang [Windows-1252].

    To speed testing, I added the Master Key at startup (in DR, entity key_master gets inv_map_start 1).

    The game was then played through in English. This is a short FM, with just a few readables and messages that don’t really change with difficulty level, so this is viable. The briefing has to be eyeballed once, and the objectives list a few times during play, to ensure all is well.

    Results? As expected, a few typos were introduced when editing #str in .map, .xd, .lang. files, so needed correcting.

    I have some thoughts about strategies for the upcoming testing in the other languages, as well as in larger FMs, but later for that.

     

    • Like 1
  4. The Problem

    Readables are available in a wide range of TDM bitmap fonts. Unfortunately, the majority of these fonts lack non-ASCII glyphs for European languages, and it would be a prohibitively lengthy task to craft them. This is one of several translation hurdles. (Another is soliciting, organizing, and distributing the work of human translators; see “AI for Translations: An Exploration” for work on an alternative. This also promotes the use of meaningful alphanumeric #str_ IDs - possibly automatically generated – instead of traditional numeric.)

    A Proposed Solution

    Suppose that when a particular page of readable is shown, it is shown first is English, with the mapper-specified font (e.g., Camberic), and then, after a number of seconds, shown in the current user-selected language, with a different font (e.g., Stone), one that offers the needed diacritics? And with the translated font size scaled down to accommodate potentially more-lengthy translated strings?

    Both the English and translated text can be viewed in sequence. That opens the door to “quick and dirty” default translations, e.g., machine translation. In particular, the reader may sometimes be able to work-around any layout problems and sub-optimal translation by consulting the English text.

    (Nevertheless, default translations may sometimes miss subtle nuanced hints, so the ability to improve them with tweaked text is a necessity.)

    The Proposed Mechanics

    Recall that the game engine currently passes these values to a readable’s gui:

    • gui::title
    • gui::body

    With a multipage readables, the content of these parameters changes as pages are flipped.
    For clarity, it is proposed to replace them with:

    • gui::titleEnglish
    • gui::bodyEnglish
    • gui::titleTranslated
    • gui::bodyTranslated

    The latter 2 would be just like gui::title and gui::body, except that they would serve empty strings when the current language is English or there is no translation available in the current language (and so be used for gui code program logic, to suppress a transition). Observe that this behavior does not substitute an English string for a missing non-English string.

    Skip the remainder of this section if details are not of interest.

    Each stock readable .gui would need a one-time conversion to use them. Instead of the traditional 2 winDef overlays for text, there would be 4, corresponding to the 4 text-passing parameters just mentioned. This allows the translated text to fade in while the English text fades out, when an onTime event starts the transition (at 2 seconds in this example). Here is the fragment of .gui code that has been altered:

    Spoiler

     

    ...
    #include "guis/readables/readable.guicode"
    // readable.guicode already defines READABLE_FADE_TIME as 200
    // Could be moved to guicode...
    #define READABLES_TRANSLATE_FADEIN_START 2000
    #define READABLES_TRANSLATE_FADEIN_DONE 2200
    
    windowDef Contents
    {
        ...
        float hasTranslation 0
        float translationShown 0
    ...
    
        windowDef titleEnglish
        {
                WORLD_SCALE
                rect 30, 56, 260, 310
                forecolor 0, 0, 0, 0 //1st show before fadein. keep at 0,0,0,0
                font "fonts/camberic"
                text "<title>"
                textscale 0.4
        }
    
        windowDef bodyEnglish
        {
                WORLD_SCALE
                rect 30,56, 260, 310
                forecolor 0, 0, 0, 0
                font "fonts/camberic"
                text "<body>"
                textscale 0.31
        }
    
        windowDef titleTranslated
        {
                WORLD_SCALE
                rect 30, 56, 260, 310
                forecolor 0, 0, 0, 0 //1st show before fadein. keep at 0,0,0,0
                font "fonts/stone"
                text "<title>"
                textscale 0.33 // Keep font smaller than original for languages with more chars per sentence.
        }
    
        windowDef bodyTranslated
        {
                WORLD_SCALE
                rect 30,56, 260, 310
                forecolor 0, 0, 0, 0
                font "fonts/stone"
                text "<body>"
                textscale 0.24 // Keep font smaller than original for languages with more chars per sentence.
        }
    
        onTime 0
        {
            ...
            set "titleEnglish::text" "$gui::titleEnglish";
            set "bodyEnglish::text" "$gui::bodyEnglish";
            set "titleTranslated::text" "$gui::titleTranslated”; // Differs from 2.12 & gui::title in that empty if no translation available for current language.
            set "bodyTranslated::text" "$gui::bodyTranslated"; // likewise
            set "hasTranslation" 0; // until we know better
        }
    
        onTime 10 // Needs to be separate block from onTime0 due to early evaluation issues
        {
            if( "titleTranslated::text" == 1.0 || "bodyTranslated::text" == 1.0 ) // 1.0 if non-empty
            {
                set "hasTranslation" 1;
            }
        }
    }
    windowDef ContentsFadeIn
    {
        notime 1
        onTime 0
        {
            transition "titleEnglish::forecolor"          "0 0 0 0" "0 0 0 0.85" READABLE_FADE_TIME;
            transition "bodyEnglish::forecolor"       "0 0 0 0" "0 0 0 0.85" READABLE_FADE_TIME;
            set "Contents::translationShown" 0;
        }
    
        onTime READABLES_TRANSLATE_FADEIN_START
        {
            if ( "Contents::hasTranslation" )
            {
                // Fadeout...
                transition "titleEnglish::forecolor"          "0 0 0.1 0.95" "0 0 0 0" READABLE_FADE_TIME;
                transition "bodyEnglish::forecolor"       "0 0 0.2 0.75" "0 0 0 0" READABLE_FADE_TIME;
                // Fadein....
                transition "titleTranslated::forecolor"          "0 0 0 0" "0 0 0 0.85" READABLE_FADE_TIME;
                transition "bodyTranslated::forecolor"       "0 0 0 0" "0 0 0 0.85" READABLE_FADE_TIME;
            }
        }
    
        onTime READABLES_TRANSLATE_FADEIN_DONE
            {
            if ( "Contents::hasTranslation" )
            {
                set "Contents::translationShown" 1;
            }
            }
    }
    
    windowDef ContentsFadeOut
    {
        notime 1
        onTime 0
        {
            // Fadeout...
            if ("Contents::translationShown" == 0)
            {
                transition "titleEnglish::forecolor"          "0 0 0.1 0.95" "0 0 0 0" READABLE_FADE_TIME;
                transition "bodyEnglish::forecolor"       "0 0 0.2 0.75" "0 0 0 0" READABLE_FADE_TIME;
            }
            else
            {
                transition "titleTranslated::forecolor"          "0 0 0.1 0.95" "0 0 0 0" READABLE_FADE_TIME;
                transition "bodyTranslated::forecolor"       "0 0 0.2 0.75" "0 0 0 0" READABLE_FADE_TIME;
            }
        }
    }

    Aspects of this Design – Transition from English

    The transition is timed, so no extra “Translate” button is shown, nor a hard-to-come-by hot key required. If you want to see the English again, you would briefly navigate away from the page to another, then return; or, if a single-page, close and re-open it.  A simple implementation (as the code above and example below) uses a fixed, hard-coded time.

    Alternative Mechanism. At some cost to code clarity, it is probably possible to get by with just the 2 normal text-passing parameters (gui::title and gui::body) and their traditional 2 overlays, though additional variable(s) would be needed for tight time-synchronization between engine and gui; and overlapping fade-in/fade-out between English and translation would not be possible.

    Advanced Version. In the longer term, timing could be made more flexible, by passing it as parameter from the engine, e.g.:

        “gui::transitionTime”

    Where does this value come from? While it could somehow encoded into the .xd file by the mapper, I prefer a different approach. Have the engine calculate it from character or word count of the body, with user-specified globals for reading rate and min and max bounds, e.g.:

        sys_readablesWordsPerSecTransTime
        sys_readablesMinTransTime
        sys_readablesMaxTransTime

    A drawback of a timed transition is that additional reading time is needed to get to the translations, which may, with immobile readables, increase risk of discovery by guards. So having these additional user controls would let a user get to the translations faster, even skip the English entirely by setting bounds to zero.

    A Simulated Example – FM “readableTranslationFadeIn”

    In the absence of engine support for the 4 text-passing parameters, it is still possible to make an approximately-functional mockup using some hard coding. However, this prototype DOES NOT suppress the transition when the current language is English. That is, it shows (rather than prevents) an English-to-English transition with change of font & font-scale.

    TDM with the languages set to “Francais” (French). The first screen shot shows page 1 of a 3-page scroll, momentarily displayed in English with Camberic title and body.

    translation_fadein_page_1_English.thumb.jpg.13050b96cc39ec21cf6965174417c429.jpg

    After a few seconds, it transitions to the second screen shot, in French in Stone font. With accents.

    translation_fadein_page_1_French.thumb.jpg.dd0e832cfec32ce3eadc8edb2a99541f.jpg

    While shown here as a scroll, this approach should be easily adaptable to books and sheets.

    About the Example’s Implementation

    The screen shots are from a prototype FM: readableTranslationFadeIn

    Notable files are:

    • guis/readables/scrolls/scroll_calig_camberic.gui, a custom override of the standard Camberic scroll readable, with the translation transition mechanism from above, plus additional simulation fakery described below.
    • strings/all.lang, a UTF-8 file containing 6 #str_ (2 per scroll page – title & body) in each language section. Only the [English] and [French] sections were implemented. The English example content was loosely derived from the St. Lucia FM. The English text (without #str_ structuring) was manually converted to UTF-8 French using Google Translate (website, not API).
    • strings/english.lang & french.lang. These were generated from all.lang using my gen_lang_plus program to create the 8-bit “ANSI” versions as required, e.g., ISO-8859-15 encoding for French.
    • xdata/readableTranslationFadeIn.xd, that contains the #str_IDs for the 3 scroll pages.

    Within scroll_calig_camberic, this simulation had this fakery:

    • “gui::title” and “gui::body” were stand-ins for hypothetical parameters “gui::titleTranslated” and “gui::bodyTranslated”;
    • The English text was hard-coded, and the appropriate content selected by actual parameter “gui::curPage”, to make up for missing hypothetical parameters “gui::titleEnglish” and “gui::bodyEnglish”.

    The READABLE_FADE_TIME is currently set to 2 seconds for testing. Probably 5-6 seconds would be better during game play.

    Aspects of the Design – Font Scaling

    As mentioned earlier, the translated font is scaled to make the text smaller than the original, to accommodate languages that need more room. A simple implementation (like in the example code) uses fixed values with “textscale”. So the textscale for the two Translated winDef overlays is smaller than for the 2 English winDef overlays. Specifically, in the example GUI code, the text scaling factors from the original Camberic readable were retained:
                textscale 0.4 // titleEnglish
                textscale 0.31 // bodyEnglish

    and supplemented by (with a different font, namely Stone):
                textscale 0.33 // titleTranslated
                textscale 0.24 // bodyTranslated

    The goal is to keep the rendered text smaller than the original English rendering for languages with more characters per sentence. These values, while hard-coded, will differ across readables (due to different starting fonts), and would need to be experimentally determined.

    But this treatment, with just a fixed scaling value that is independent of both text content and current language, is unlikely to be very satisfactory. Better ideas, needing additional engine modifications, will be considered in a follow-on post.

    Additional Considerations

    When Authoring the XD File. Recall that TDM is relatively inflexible when using #str_ within an .xd file. So this form will not work:
        "page1_body"    :
        {
            ""
            ""
            "#str_fm_scroll_camberic_pg1_body_parish_inspection_excerpts" 
        }
    Instead use
        "page1_body"    : "#str_fm_scroll_camberic_pg1_body_parish_inspection_excerpts"
    With the 2 leading linebreaks moved into the #str content as leading \n\n.

    When Testing. If there is a mismatch between the TDM Language setting and the PC’s language setting (e.g., under Windows), then some characters may turn out wrong or indicated as missing (e.g., as boxes). The degree will vary by language, and is unlikely to be seen in the initial English render (because that’s almost all in ASCII, common to all the ISO encodings.) Even with such mismatches, the translation can be reviewed as to overall length and where linebreaks occur.
    Be aware that direct editing of *.lang files is not recommended, and could risk converting from a particular “ANSI” raw 8-bit encoding into “UTF-8”.

    Applying this Technique More Broadly. A few fonts have oddball glyphs for certain characters, e.g., a skull and crossbones in Treasure Map. This would require special handling during translation.

    For Briefings, Objectives, and Messages, similar approaches can be conceived. However, for each of these (and different from readables), only one particular font is routinely offered. And there are alternative designs to be considered. For instance, the English and Translated text could be shown simultaneously side-by-side in various ways, instead of sequentially. The Objectives have the additional complication that the font size is already user-adjustable.

     

    • Like 3
  5. Yet More #str_ Added – Shouldered Names

    The I18N.pl script does not automatically localize characters’ “name”, “shouldered_name” and “shouldered_name_dead”. And that’s generally the right call. You don’t necessarily want to translate character names. For instance, I prefer my character “Emily” be left that way, i.e., not rendered as for instance “Émilie” in French.

    (And I’m unclear whether the “name” spawnarg should ever get the #str treatment. So really just looking at “shouldered_name” and “shouldered_name_dead”.)

    But I do have 3 characters that have not just name, but also title/rank, and it might be worthwhile to translate the title/rank, along the lines:

        "#str_fm_map_shouldered_name_capt_riggs"    "Capt Riggs"
        "#str_fm_map_shouldered_name_first_mate_logah"    "First Mate Logah"
        "#str_fm_map_shouldered_name_second_mate_chaf"    "Second Mate Chaf"

    By story design, only Logah is actually shoulderable, but for completeness I’ll do all three.

    I’m using the same #str_  for both “shouldered_name” and “shouldered_name_dead”.

    BTW, an argument could be made that the character #str_ renames should include the class, e.g., ..._atdm_ai_townsfolk_wench_..., ..._atdm_env_ragdoll_guard_thug_, etc. However, a class name, while explaining the visual appearance, can be somewhat remote from the precise role that the AI plays in the FM.

    Prompt to ChatGPT

    With Air Pocket specifics in italics.

    You are an expert translator between English and other European languages, including Russian. You wish to translate a list of crew members on a small historic sailing ship, from English to these other languages. Each line of the list begins with a tab, then a word beginning with #str_ within double quotes, then another tab, then a phrase within double quotes. When you translate a line, in the output keep the tabs and the #str_ word unchanged, and only change the phrase to the other target language in UTF-8, keeping it in double quotes. Make the translated phrase reasonably short while preserving the formal meaning. Avoid modern slang. Old-fashioned wording is fine. At the end of each line, add another tab, the fixed text "//bt: ", and then a back-translation of the previously translated phrase into English again. When back-translating, ignore the original English phrase.

    The crew members names are Riggs, Logah, and Chaf, all males. Keep the name-portion unchanged, but translate the title-portion. Some centuries ago, they were on a small coastal sailing ship, trading in goods along the coast. In that time frame, take account of different countries having different merchant marine titles aka ranks. "Capt" here is an informal title for "Captain". These titles may be less formal than those of military officers aboard navy ships.

    Following the input list of crew members, append output lists in these target languages: 1. German 2. French 3. Polish 4. Italian 5. Spanish 6. Portuguese 7. Russian 8. Czech 9. Hungarian 10. Dutch 11. Slovak 12. Danish 13. Swedish 14. Romanian 15. Turkish 16. Catalan.

    List of crew members:
        "#str_fm_map_shouldered_name_capt_riggs"    "Capt Riggs"
        "#str_fm_map_shouldered_name_first_mate_logah"    "First Mate Logah"
        "#str_fm_map_shouldered_name_second_mate_chaf"    "Second Mate Chaf"

    Results

    This time, ChatGPT did not place results in separate frames by language; because of fewer input strings? Also, between the title and name in every case was some odd “hollow box” character (possibly null character 0x00); I did a replace-all of those with <space>. What follows is the start of ChatGPT’s response (with my hollow-box correction).

    Here are the translations for each language. I’ve preserved tabs and identifiers exactly, translated the title portion into the historical/formal equivalent, and added brief back-translations (“//bt: …”) into English.

    1. German
    "#str_fm_map_shouldered_name_capt_riggs" "Kpt Riggs" //bt: Capt Riggs
    "#str_fm_map_shouldered_name_first_mate_logah" "Erster Maat Logah" //bt: First Mate Logah
    "#str_fm_map_shouldered_name_second_mate_chaf" "Zweiter Maat Chaf" //bt: Second Mate Chaf

    2. French
    "#str_fm_map_shouldered_name_capt_riggs" "Capt Riggs" //bt: Capt Riggs
    "#str_fm_map_shouldered_name_first_mate_logah" "Second Capitaine Logah" //bt: First Mate Logah
    "#str_fm_map_shouldered_name_second_mate_chaf" "Tiers Capitaine Chaf" //bt: Second Mate Chaf

    3. Polish
    "#str_fm_map_shouldered_name_capt_riggs" "Kapn Riggs" //bt: Capt Riggs
    "#str_fm_map_shouldered_name_first_mate_logah" "Pierwszy oficer Logah" //bt: First Mate Logah
    "#str_fm_map_shouldered_name_second_mate_chaf" "Drugi oficer Chaf" //bt: Second Mate Chaf

    4. Italian
    "#str_fm_map_shouldered_name_capt_riggs" "Capitano Riggs" //bt: Capt Riggs
    "#str_fm_map_shouldered_name_first_mate_logah" "Primo ufficiale Logah" //bt: First Mate Logah
    "#str_fm_map_shouldered_name_second_mate_chaf" "Secondo ufficiale Chaf" //bt: Second Mate Chaf
    ... etc.

    Other Observations

    As you can see above, ChatGPT neglected to provide a leading <tab> to each line, so I had to.

    I had forgotten to think about asking for “title case” in the prompt. Maybe just as well, since the 2 meanings of “title” (rank and capitalization) might have confused the AI. In the results, the person’s name was always left capitalized, and the first word. In a few languages, the second word was also capitalized; in most, it was not. I wasn’t sure if that’s how ranks are usually treated in various countries. I decided (unlike the inventory case) not to adjust the AI-provided casing.

    Finally, ChatGPT back-translated “Capt” always to “Capt”, never to “Captain”, even if the foreign language gave the full word, e.g.,  Capitaine. Probably evidence of its cheating here. I’m getting less confident in back-translation as a way to insure quality... at least when done by the same AI that does forward translation, and at the same time. Or maybe it’s just a ChatGPT shortcoming, and other AIs would do better.

     

    • Like 1
  6. Inventory Item Results

    The AI results, generated in a flash, were successful overall. The desired output format was achieved, and the back translations detected very few problems. An example result (before alphabetic ordering), from [Italian]:
        "#str_fm_map_inv_key_sea_trunk"    "Baule da mare"    Sea chest
        "#str_fm_map_inv_key_galley"    "Cucina di bordo"    Ship’s galley
        "#str_fm_map_inv_key_mess"    "Refettorio"    Mess
        "#str_fm_map_inv_key_master_key"    "Chìave maestra"    Master key
        ...
    Caveats: A judgement of “overall success” is without ground-truthing by human translators. The back-translation uses the same AI as the forward-translation... loosely analogous to “grading its own paper”. Though told not to look at the original English when back-translating, is it cheating and lying?

    Revisions and Refinements

    Title Casing for Inventory Items. I should have explicitly prompted to generate this in the results. I manually corrected these (leaving separators like di, de, do, del in lower case), but a few cases required ChatGPT consultation to get the proper upper-case Unicode letter.


    Mess. Two translation needed revision, because they seemed to suggest an eating area of a much larger or luxurious ship:
    [German]
        "#str_fm_map_inv_key_mess"    "Offiziersmesse"    Officers’ mess
    [French]
        "#str_fm_map_inv_key_mess"    "Salle à manger"    Dining room

    Rather than rerun the whole translation, I just asked ChatGPT about these cases:
     

    On a small ship, what is the mess or eating area called in French?
    On a small ship, the eating area is typically called la cambuse or le carré, depending on the context:
        Le carré – This is the standard French nautical term for the crew's or officers’ mess (eating/living area) on a ship. It’s equivalent to the English term mess.
            Carré des officiers – officers' mess
            Carré de l’équipage – crew mess
        La cambuse – Often refers to the ship’s food storage or sometimes the galley (kitchen), but in some informal or small-vessel contexts, it can be used more broadly, though this is less precise for "mess."
    So for a small ship’s mess or dining area, the best and most nautical French term is le carré.

    Consequently, I just replaced "Salle à manger" with “Carré”. Likewise, in German, simply “Messe”.

    Master Key. This was back-translated as “Main key” in 6 of the languages, which I guess is OK. In Air Pocket, a Master Key (entity name key_master) was left in the blue room, if needed for a debugging build; inaccessible otherwise. Arguably, it should not be translated at all. So I’ll comment those non-English lines out. (“Master key” is not part of TDM’s base strings?)

    Captain’s – Odd Punctuation and Perhaps Compound Unicode. The German translation of “Captain’s Cabin” had odd punctuation: "Kapitan­s¬kajüte" (and some evidence of a Unicode “combining character”). I did a separate follow up to ChatGPT, and revised to drop the punctuation and add an umlaut over the 'a': Kapitänskajüte.

    Another German use of “Captain’s” was similarly revised. German reportedly never uses apostrophe for possessive form.

    Captain’s – Title versus Name. There was one case in [Danish] where the word “Captain’s” was not translated, as if it was a person’s name. (Also, reportedly, Danish does not generally use apostrophes for possessives; there are exceptions, but doesn’t seem to apply here.)

    State of All.Lang So Far

    Starting from a temporary file into which I pasted the raw AI results (with [<language>] headers added), I fabricated all.lang by:

    • Making sure it had Unix line ending, not CRLF. (In Notepad++, Edit/EOL Conversion/Unix).
    • Begin it with a first-draft preamble comment, heavily adapted from TDM’s all.lang preamble.
    • Following that, a line with just an opening bracket. And a closing bracket line at end of file.
    • Making the handful of translation corrections mentioned above.
    • Change the casing to Title Case. (I didn’t bother changing the back-translation’s case.)
    • Tagging the back-translations with “//bt:”, so they are denoted and if need be can be quickly stripped out with an editor. (If subsequent revision is manually applied, the delimiter will also be altered; preamble will provide guidance.)

    Lessons Learned So Far

    Improvements to Prompting...

    • Specify that the FM’s ship is small.
    • Specify that “Captain” is a title, not a person’s name. (Hmm, there’s some shouldered names, not touched by I18N.pl, that maybe should be partially-translated too, with titles like “First Mate Logan”.)
    • For inventory items (and likely readables titles), ask the AI to make the output in Title Case.
    • Tell the AI not to generate Unicode combining characters.
    • Ask the AI to add a special delimiter “ //bt: “ before the back-translation.
    • To the extent possible, convert any directional punctuation (apostrophes, single quotes, double quotes) to non-directional, to comply with TDM font limitations.
    • Since it seems to give better results if you ask about one specific item (like “mess” in French), maybe it’s optimizing for speed instead of accuracy. Ask it to take more time?
    • ChatGPT translation seems to have problems with possessive forms... or at least those problems are more-easily spotted during review.
    • Speculation: maybe one cause of this is that I didn’t specify which country or regional dialect of a language to use. Perhaps a prompt to “prefer the form of language spoken in a language’s originating country, within or adjoining Europe.”

    Concerns about Translation Length...
    The results are generally short, but in-game will some of them prove to be too long? Traditionally, inventory names are limited to 2 lines, with “\n” needing to be inserted. This will need to be tested eventually.

     

    • Like 1
  7. Desired Results File with All Languages

    What is want to end up with from our AI (with any iterative fixup and manual integration) is an FM-only version of TDM’s UTF-8 all.lang.  That file has 17 language sections in a particular order, which we will adopt too (in the prompt further below), although, other than English being first, it doesn’t really matter.

    Once we have our FM all.lang, we can easily generate all the required ISO-encoded *.lang files, e.g., french.lang.

    Strategy of Feeding the AI

    One approach would be to just feed the entire #str list in one gulp, with prompt engineering that covers all aspects. This would minimize the post-translate integration time. But the concern is that prompt engineering becomes more difficult. The AI might get confused about what restrictions and hints apply to which strings. While sometimes shared context across strings can be helpful, too much shared context could lead to overly-creative translations (e.g., hallucinations).

    [BTW, if accessing the AI through an API, there’s often a "temperature" value you can specify, from 0.0 to 1.0, from most-predictable to most-creative. We can use a few words in the prompt to approximately achieve similar ends.]
    So, to maintain more control, I’m going to batch-feed. The assumption is that translating #str_ in batches of related input groups will allow more focused guidance from prompt engineering, leading to better results. I’ll start with inventory items, that have the shortest strings and most dictionary-like lookup.

    An alternative/additional batching (particularly needed with large FMs) would be by "scene".

    In the case of Air Pocket, it could be thought of broadly as 4 scenes, based on timeline and location. The story, as driven by objectives, is fairly linear; larger FMs would typically have some randomization in scene order.
    Would batching by scene be useful (i.e., give better AI results) for some of Air Pocket’s #str_ s? Thinking this over. But for now, treat inventory items independent of scene.

    Prompt Engineering for Inventory Items

    A stab at a reusable template follows in blue. It describes the overall translation task, the desired tone, and input and output formats. Text specific to inventory items is shown in boldText that is specific to this FM, to clarify the context and the meaning of particular words, is in italics, with spoilers hidden.

    You are an expert translator between English and other European languages, including Russian. You wish to translate a list of inventory items, all inanimate objects, from English to these other languages. Each line of the list begins with a tab, then a word beginning with #str_ within double quotes, then another tab, then a phrase within double quotes. When you translate a line, in the output keep the tabs and the #str_ word unchanged, and only change the phrase to the other target language in UTF-8, keeping it in double quotes. Make the translated phrase reasonably short while preserving the formal meaning. Avoid modern slang. Old-fashioned wording is fine. At the end of each line, add another tab, and then add a back-translation of the previous phrase into English again. When back-translating, ignore the original English phrase.

    Most of the inventory items are keys, and the associated phrases describe locked doors to particular locations aboard a ship, or locked trunks or safes on a ship. The "Master Key" opens all locks.

    Spoiler

    The only inventory item that is not a key is a gold ingot, labeled "Too Heavy", a warning that it weighs too much.

    Following the input list of inventory items, append output lists in these target languages: 1. German 2. French 3. Polish 4. Italian 5. Spanish 6. Portuguese 7. Russian 8. Czech 9. Hungarian 10. Dutch 11. Slovak 12. Danish 13. Swedish 14. Romanian 15. Turkish 16. Catalan.

    List of inventory items:
        [... skipping 1 potential spoiler]
        "#str_fm_map_inv_key_captains_cabin"    "Captain's Cabin"
        "#str_fm_map_inv_key_captains_safe"    "Captain's Safe"
        "#str_fm_map_inv_key_galley"    "Galley"
        "#str_fm_map_inv_key_master_key"    "Master Key"
        "#str_fm_map_inv_key_mess"    "Mess"
        "#str_fm_map_inv_key_sea_trunk"    "Sea Trunk"

    Using ChatGPT

    As discussed at the outset, you can use this without signing in (it will nag you). Also, if you’d like it not to retain your input for training purposes, click on the circled question mark and change it under "Settings".
    As of this post, of you ask ChatGPT what model it’s using, it responds "You're currently chatting with GPT-4o, the latest model from OpenAI as of 2025. The "o" stands for "omni" — it's designed to handle text, images, and more, all in one model."

    Following up by enquiring about usage limitations, it says "Free users can access GPT‑4o, but with strict usage caps, which vary based on demand and time of day". More specifically, "usage falls in the range of 5–16 messages per 3–5 hours, after which you'll be limited or switched to GPT‑4o‑mini." The latter is a faster but lower-accuracy model. "We’ll notify you once you’ve reached the limit and invite you to continue your conversation using GPT-4o mini or to upgrade to [paid] ChatGPT Plus."

    Because I’m doing this at a leisurely rate (and reporting it to you in posts), the usage restrictions should not bite.

    About Input and Output Formats

    As you can see above, the input is the AI prompt, appended with content from english.lang, namely, the lines between the "{" and "}" brackets. For those lines, no change to tab-separation is done.

    The output is the same format, but with an added English back-translation added to each line.

    When ChatGPT generates the response, each requested language is enclosed in its own HTML response frame, with a separate "copy" link. So you have to copy each link separately, pasting them successively into your FM-specific all.lang file while adding headers, e.g., [French].

    Also, the frame margin contains the word "vbnet". When I asked, ChatGPT indicates that’s the style of syntax highlighting applied to the results, based on source material, but it may be inappropriate and so ignorable. Which explains why the word "key" was always colored green.

    In the next post, I’ll discuss the specific results.

     

    • Like 1
  8. Renaming the Strings

    What I describe here is a bit of a fib, reflecting where I ended up, not the iterative process. And some renaming was refined after the first trial AI run.

    Before doing the translation, I’m going to rename all the FM-specific strings.

    Why? It’s fair to say the #str_ system was never a hit with DR developers, so features that would support it are scant. A way to make it less painful for mappers to inspect a post-conversion FM in DR is to change the #str_<5-numbers> to #str_<meaningful string>.

    By convention, <meaningful string> is limited to ASCII alphanumeric characters plus underscore, with no spaces. This generally does not include a version of the full English string (and there are length limitations), but something that clues the mapper. And groups things helpfully.

    More specifically, I’m going to rename all the FM-specific strings from #str_2xxxx to: 

    	#str_fm_<file_source><grouping_and_ordering>_<unique hint>

    The <grouping_and_ordering> substrings are chosen with an eye to both viewing in DR and group translation. (Later, I’ll talk about batch-processing strategies. The group-naming here does not use a “by scene” strategy.)

    So this is done first in english.lang and then the altered assets (in .map, .xd).
    As I search for “#str_” in assets, I am aware that some strings are TDM-level defined, so I should not rename them. These have stringID numbers under 20000, or in theory beginning with #str_main_menu. There were 2 of these found in airpocket: items with inv_name or inv_category given by #str_10052 and #str_02381.

    Specific Renamings and Examples

    I’ll be truncating example strings here for brevity and to reduce spoilers. The categories (and an example or two of each) follow.

    Darkmod.txt and Readme.txt. As mentioned, TDM doesn’t really handle translations of these. Just for completeness...

    	"#str_20000" ==> "#str_fm_darkmod_txt__title"	"Away 1: Air Pocket"
    	"#str_20001" ==> "#str_fm_darkmod_txt_desc"	"On the run from her husband, me and my girl. With a bribe to a ship's captain, we're away. What could go wrong now? Oh, dammit."

    Notice that here (and elsewhere) I’ve added an extra “_” before “title”. This is so, when sorted alphabetically, the title comes first before the description (in this case) or body.

    Mission Briefing. This tells a story, so be sure that its titles (if any), bodies, and pages are well-order to present to the AI. Since the briefing text is long and complex, I’ve opted not to include a #str <hint> for it.

    	"#str_20026" ==> "#str_fm_mission_briefing_xd_pg1_body"    "Life is sweet! It's been 3 days since Emily ran away with me, leaving behind her bastard husband. He's miles away now, as our tradeship Esmeralda plies herself down the coast.\n [...] \nAnyway, Emily's set up a cozy bunk for us in the bow galley.\n"

    Readables. Like the briefing, a readable often tells a story and should be well-ordered to present to the AI. I’ve chosen to skip hints for these.
     

    	"#str_20024" ==> "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2__title"	"Further Instructions\n"
    	"#str_20025" ==> "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2_body"	"\n\n\n\nAs agreed, the 2 new crewmates will bunk in the galley, and share the sea trunk nearest the door in the mess.\n\nCaptain Riggs\n"

    Inventory (mainly keys). The hints for these are usually just the full text, in lower case with punctuation dropped. They could include info about item class or difficulty, but I didn’t find it necessary here.

    	"#str_20021" ==> "#str_fm_map_inv_key_captains_cabin"	"Captain's Cabin"

    (I’ve went with “...fm_map...” here instead of “...fm_airpocket_map...”. I hope that works out. Too bad “map” itself has multiple TDM meanings, e.g., “hungarian.map”; in-game map. Also, since the order in which these inventory items are discovered in the game is pretty variable, I decided not to embed a numeral to force ordering.)

    Objectives. The objectives as coded in the .map file (and so generated english.lang #str_ numbers) are generally not listed in objective number order. It’s helpful to so reorder the rows. Also, because there are more than 9 objectives, I added a leading “0” to those with a single digit, for lexical ordering. The “obj<num>” is the DR/.map-internal objective number, which loosely corresponds to presentation to the game player, although objectives come and go.
     

    	"#str_20008" ==> "#str_fm_map_obj01_no_hurt_or_loot_crew"	"Rile neither the captain nor crew. No assaults by me or gratuitous thieving... not that there's much to steal."

    Thought messages. The I18N.pl script did not catch these, so they are manually added (and there’s no #str_2xxxx to begin with). I included an ordering digit after “msg”, roughly reflecting game occurrence (though difficulty and player actions will affect which of these appear and when):
     

    	"fm_map_thought_msg1_sword_stolen"	"My sword's been pilfered... I'm not surprised. It's likely still on-board somewhere."

    Ordering the Strings

    The #str_ lines in “english.lang” are extracted and put into ASCII alphanumeric order. I used Notepad++ with Edit/Line Operations/Sort Lines Lexigraphically Ascending. Any full-line grouping comments (beginning with //) will need to be manually repositioned after this.

    Programmatic Ideas to Better Support Renaming

    This would be helpful: A program that took the original 5-digit #str form of english.lang, and the renamed-#str english.lang form, and did the surgery on <fm>.map and *.xd .

    Or alternatively you could make a version of english.lang with an extra tab-separated field; this is what I created manually as a reference for myself, with lines like:

    	"#str_20003"	"#str_fm_map_inv_key_sea_trunk"	"Sea Trunk"


    Then have a program that took that and did the surgery, using the first 2 fields.

    Or most ambitious of all, a version of I18N.pl that created alphanumeric #str_ in the first place, instead of 5-digit ones. Using something like the naming scheme above, this is mostly straightforward. However, for strings that are long sentences, if you still wanted to automatically summarize them into a few words for the #str <hint>, you’d probably need to call upon AI.

    Next Up

    I’ll make a version of this data available a little later on. Next post: prompt engineering!

     

    • Like 1
  9. On 6/26/2025 at 2:07 AM, covert_caedes said:

    I know it's not necessary to finish this mission, but

    Spoiler

    Yes, the sword exists. In easy difficulty, it's in your sea trunk, unpilfered. In the other settings, it's either under a pillow/corpse in the disrupted crew bunk, or, in that same area, in the air pocket where the potions are (on the shelf but below water).

     

  10. Starting the Conversion with I18N.pl

    Folks have found that it is best to do the conversion only when an FM is complete, which is certainly the case here.

    As a prerequisite, I installed and tested Strawberry Perl on my Win11 dev box. It worked this time, unlike in the distant past on a different, Win10 machine, where I tried and failed for 3 days to get it to work.

    Then, as specified in our wiki’s I18N page, I set up a directory with the airpocket.pk4, and a copy of TDM 2.13 strings/english.lang. (The wiki specification missed that last step, which I corrected.)

    Running the I18N.pl script generated the expected results in its “output” folder: an altered airpocket.pk4 and new airpocket_l10n.pk4. Within both, a /strings/ folder was now present with just an english.lang file, with 28 #str_ entries, covering:

    • maps/airpocket.map
    • xdata/airpocket.xd
    • xdata/mission_briefing.xd
    • darkmod.txt
    • readme.txt

    However, I believe #str_ support for the last 2 was never implemented (and the output version of those files don’t include #str_ s). So this will be lowest priority for me.

    Also, I don’t see any auto-generated #str_ for the 5 player “thought messages”. Probably need to add these. English text for these is in the .map file (found by searching for tdm_message_no_art).

    • Like 1
  11. Scope of Work

    TDM currently supports a partial localization with a moderate amount of work. With a heroic effort, and a separate build on a separate site like DarkFate.org for Russian, one particular language can be fully supported. I'm not striving for that.
    Instead, I'm looking at AI-enabled moderate-effort partial localization across ALL TDM-supported languages for a particular FM.

    Choosing a Test FM 

    As a test case, I'm going to report on internationalizing my small FM "Away 1: Air Pocket" (install name is airpocket). This FM is an "internationalization virgin". Furthermore, here's what it DOESN'T have:

    • Custom video or speech audio
    • Custom inventory items or weapons
    • Custom images with embedded language text
    • Signage
    • Mission title & credits on splash screen (as either text overlay or embedded)

    Instead, these are my work targets, to translate to ALL languages:

    • Text briefing
    • Standard inventory items, mainly keys, but with custom names.
    • Objectives
    • A few sheet readables
    • Some "thought messages" from the player (in the form of white text in midscreen, done with atdm:gui_message with atdm_gui_message_no_art.gui)
    • Like 1
  12. Back in the day, Tels managed a squad of volunteer translators for TDM. I am not Tels, and could never do that.

    Nowadays, language translation using AI, either traditional machine learning (ML) models or large language models (LLMs), is common and increasingly fluent. It is often used as an adjunct to speed the work of professional human translators. By itself, AI translation can be imperfect but usually sufficient.

    Can this "sufficient" approach be used for TDM, to expedite translations? Let's see.

    I gave some initial thought to a bulk-translation daemon that might range across FMs and fill in all missing translations, without necessarily involving mappers. In the future, possibly AI could tackle that whole enchilada. I was at first visualizing something more modest: a backbone in a standard programming language (I sketched out C++ and C# projects, but lots of other possibilities) that would make calls to an API (I looked at those of Google Translate and ChatGPT).

    However, I changed focus due to certain concerns...

    1. Different FMs, and subsets with each FM, would likely have far better translations if they were properly grouped, ordered, and translated separately, with an appropriate context (e.g., phrase engineering) added. The FM's mapper is best placed to provide this grouping and context. I'll detail what I mean in the next few posts.
    2. The mapper would not be expected to know any TDM-supported languages besides English. Instead, each translated phrase could be back-translated to English and examined. Is the "round-trip" meaning OK, even if the English words have changed? Problematic translations could have their context tweaked and rerun.
    3. Many AI systems, particularly for API access, require a billing commitment (e.g., credit card).  For a professional translator, this is no problem, and subscriptions allow access to more (and putatively better) models and higher quotas. This seems less appealing for TDM. A few paid AI systems have a no-subscription, pay-as-you-go account tier. The cost per translate is typically pennies. But it does introduce quota- and expense-management, and may exclude API usage.
    4. Access via API requires an API key (or at the higher end more elaborate security regime), with attendant key-security headaches.
    5. Which AI model is thought "best" for translation? Doesn't matter too much, because we can't afford the best. Furthermore, there's endless churn among AI models, with antidotal reports that a given model fluctuates in quality over time, and successor models can be worse than their predecessors.

    So, with these concerns in mind, I looked for public web-based AI sites that require no billing and provide low-quota but adequate AI. The mapper would enter and retrieve data manually. I will focus on ChatGPT in this exploration, after a quick preliminary test confirmed some promise.

    Also, as this exploration proceeds, I hope to propose changes to TDM to make it more viable for "sufficient" quality machine translation. Problem areas are incomplete fonts, space-constraints, and layout issues for translated strings. My proposals will likely surface as separate forum threads.

    That's enough for now. I'll be trying for 1 or 2 substantive posts per week, as I tackle a particular FM.

    • Like 1
  13. Just to complicate your life, there are 3 additional aspects to consider about the circa-2014 Mason files, and subsequent circa-2017 improvements to the 'english' version perhaps applicable to your work. (These issues are covered in the wiki "Mason Font" article, with a bit more in my "Analysis of 2.12 TDM Fonts", https://forums.thedarkmod.com/index.php?/topic/22427-analysis-of-212-tdm-fonts/. The 2017 changes can be seen in the *current* 2.13 TDM English Mason files.)


    1) Need for custom DAT-scaling on certain Mason characters
    The source TTF had upper-case and lower-case characters that were early-on considered too similar to size. So (before 2014) in the DAT, selective per-character scaling was used to differentiate them.  See https://wiki.thedarkmod.com/index.php?title=Font_Metrics_%26_DAT_File_Format#Per-Character_Font_Scaling for details.

    As you add new characters, you should do likewise (relatively easy with refont).

    2) Creating the "glow" of mason_glow

    How Tels created the glow (for 'english' carleton & mason) is discussed in reasonable detail here:

    https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/5/#findComment-262661

    That could be done for Russian too, which I recall currently fakes a glow, and possibly would require a minor GUI or engine code change to use.

    Note: To best accommodate glow and retain GIMP-visualization-alignment between base and glow characters, Tels moved some base characters within their bitmap, to keep their glyphs 2-3 pixels away from any bitmap edge. You should consider this when placing new base glyphs.

    Note: For the 3 mason bitmaps doubled in size circa-2017 as discussed next, the mason_glow bitmaps were also doubled.

    3) Extensive bitmap editing to solve main menu character jaggedness.

    On Oct. 5, 2017, @Springheel in https://forums.thedarkmod.com/index.php?/topic/19129-menu-update/#findComment-412921 said:

    "Looking at the Mason fonts, it looks like they were super low res to begin with, and were then just resized [presumably referring to per-character scaling], making them even worse. I'll see what I can do." [Further on, referring to fonts in the TDM menu system:] "It appears that resizing the dds file to make it higher res is possible, so I'll proceed."

    Later, on Oct 13, 2017, he concluded within a "More detailed list of changes: "Updated the menu fonts, which were surprisingly bad before"

    Unfortunately, I couldn't find details on how this work was actually done. I assume the bitmap editing was all done in GIMP. It started with doubling the size of certain bitmaps from 256x256 to 512x512. This was done for the first 3 bitmaps (i.e., those with ASCII, some Latin-1). Then characters were made more crisp and smooth-edged. How? Dunno. Also, some odd but harmless artifacts happened within GIMP (noted in https://forums.thedarkmod.com/index.php?/topic/22427-analysis-of-212-tdm-fonts/page/3/#findComment-499660)

  14. 23 hours ago, kalinovka said:

    Today I spent 3 or 4 hours trying to edit ttf. i used Font Lab 8 for it.

    I opened the font file (it was in utf encoding). Then I tried to recode it into cp-1251 and failed.

    masonchronicles3.ttf has an unknown license status, of course. But if my attempts lead to good results, I can draw letters in ttf by myself.

    I haven't used Font Lab, but perhaps there's an online user community that can help.

    Drawing font glyphs is challenging but yes possible.

    23 hours ago, kalinovka said:

    Tels' unfinished files - are they still available ? Can I continue his work from where he stopped it ? That DDSes made by him are available in the TDM folders, but I have not found sources with .tda or ttf.

     

    And I still can't understand how to convert DAT into TGA or DDS.

    I can help you with Mason files, but I need to find the right version... I'll get back to you.

    As for DAT, think of it this way: a TDM file is described by BOTH DAT and TGA/DDS. DAT has the metadata needed for character spacing and scaling. TGA/DDS has the glyph images.

    See https://wiki.thedarkmod.com/index.php?title=Font_Metrics_%26_DAT_File_Format

  15. Regarding the existing Russian version of TDM's MasonAlternative font, this had a different origin than those Russian fonts processed by Riff_Keeper.

    Tels created this in 2012. He started from bitmaps of an ASCII Mason font, then used his Perl patch program to copy selected ASCII glyphs (that resemble in some way Cyrillic) to new font "MasonAlternative". See https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/15/#findComment-274617

    In GIMP, he flipped or otherwise hand-edited to make them Cyrillic. He said, "There are still a few dozen missing, but this is enough to render the two headlines we have (New Mission and Setting)"
    https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/15/#findComment-274623

    This accounts for the incomplete coverage.

    Speculatively, he took this approach because it couldn't find a Mason-style TTF font with both Russian characters and an acceptable license (e.g., public domain, or at the least freely redistributable for non-commercial use).

    @kalinovka,I wonder what the licensing is for your masonchronicles3.ttf.

  16. 1 hour ago, kalinovka said:

    It is interesting that Carleton font has been already converted to dds with all cyrillic letters. But who had done this work, I was not able to find.

    Evidently a significant portion of the Cyrillic work was done by Keeper_Riff (in conjunction with Tels) back in 2011. These folks are not active in TDM these days.

    Keeper_Riff outlined a workflow, starting with FontLab to edit TTF files...
    https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/12/#findComment-271548

    Specifically Carleton:
    https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/4/#findComment-262135

  17. 1 hour ago, kalinovka said:

    One more question - may you tell me how to transform dat files to dds ?

    I used the refont ulility, but it creates only a dat file. What should be done next ?

    Yes, refont just makes it easier to make changes to a .dat file (via human-editable .ref file).

    To transform your .tga's into .dds, either -

    But more typically, with GIMP, all the .tga's associated with one of the three TDM-supported font sizes are read in as separate layers, to a common GIMP project file (saved as an .xcf file). Ordinarily, you set only 1 layer visible.

    If you do any bitmap editing, the .xcf file becomes in effect the source master. Having all the files together as layers makes it easier to move a glyph (or copy parts of glyphs) from one layer to another.

    You use GIMP's Export feature (you specify the .dds extension up the top entry part to tell it the format). Be sure to generate mipmaps.

    See:

    https://wiki.thedarkmod.com/index.php?title=DDS_Creation_with_GIMP

    https://wiki.thedarkmod.com/index.php?title=Font_Bitmaps_in_DDS_Files
    particularly "Editing a Bitmap"


  18. @kalinovka, I'm assuming your TTF font codepoints are those of Unicode. A conversion to Win1251 would use this map:

    https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT


    The lower 128 are just ASCII, where Unicode and Win1251 use the same number.

    The upper 128 contain both the 90 codepoints you mentioned (starting with 1025 aka 0x0401), interposed with other characters. A proper rendering for TDM would include all these characters.


    It is possible that some old-school font editor already has an export profile for Win1251.

    As for hacking up a variant of ExportFontDoom3All256, you'd need to write a static array, holding Unicode values from CP1251.TXT. You could just do the upper 128 if you want, as shown next.

    Example:

    const static unsigned long UnicodeFor1251::array[] =
    {
      0x0402, // 0x80 CYRILLIC CAPITAL LETTER DJE
      0x0403, // 0x81 CYRILLIC CAPITAL LETTER GJE
      ... // more tedious or fancy editing here
    }

    Then the code loop would be something like [NOT TESTED]:
     

        // Export all characters.
        unsigned long sourceCharacterCode;
        for (int outputCharacterCode = 0; outputCharacterCode < Font::numCharactersToExport; outputCharacterCode++)
        {
            if(outputCharacterCode < 128)
                sourceCharacterCode = outputCharacterCode;
            else
                sourceCharacterCode = UnicodeFor1251[outputCharacterCode - 128];
            bool okay = exportCharacter(sourceCharacterCode, outputCharacterCode);
            if (!okay)
            {
                std::cerr
                    << "Error: Unable to export character " 
                    << getCharacterCodeString(characterCode) << "."
                    << std::endl;
    
                return false;
            }
        }
        ...

    Further down in FontExporter.cpp, more changes...

    bool FontExporter::exportCharacter(unsigned long sourceCharacterCode, outputCharacterCode) // WAS single parameter characterCode
    {
        Doom3GlyphDescriptor* doom3GlyphDescriptor = 0;
    
        // Get the index of the glyph that represents this character.
        int glyphIndex = self.font->getGlyphIndexForCharacterCode(sourceCharacterCode); // WAS characterCode
       ...
                    // Create a descriptor for the current glyph.
                doom3GlyphDescriptor = &self.doom3GlyphDescriptors[outputCharacterCode]; // WAS characterCode
       ...
    }

    After a successul export, there's still lots more testing and tweaking to be done, e.g., with datBounds, refont, if you want best character spacing and presentation.
    Also, TDM treats codepoint 0xFF specially, as mentioned in https://wiki.thedarkmod.com/index.php?title=I18N_-_Charset

     

  19. @kalinovka, probably no quick solution. I imagine, with a font editor that reads/writes TTFs, you could relocate the Cyrillic down to the 0-255 range in a custom TTF, which could then be processed by ExportFont3All256.

    Or, if you know C++, you could make a variant version of ExportFontDoom3All256 with a different input range (both start and end) in the loop.* The wiki page contains a link to the source code for a Visual Studio build.

    In either case, you'd want to order the glyphs (or glyph processing) as Win-1251 (and TDM) expects, so the generated .DAT files would require minimal fixup.

    * Specifically, you'd start with FontExporter.cpp, and in function FontExporter::export_, change the loop indices of:
        // Export all characters.
        for (int characterCode = 0; characterCode < Font::numCharactersToExport; characterCode++)

      ...

    But if that was all it took, I'd be very surprised.

×
×
  • Create New...