Geep's Content - The Dark Mod Forums

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Yet More #str_ Added – Shouldered Names

The I18N.pl script does not automatically localize characters’ “name”, “shouldered_name” and “shouldered_name_dead”. And that’s generally the right call. You don’t necessarily want to translate character names. For instance, I prefer my character “Emily” be left that way, i.e., not rendered as, say, “Émilie” in French. (And I’m unclear whether the “name” spawnarg should ever get the #str treatment. So really just looking at “shouldered_name” and “shouldered_name_dead”.)

But I do have 3 characters that have not just a name, but also a title/rank, and it might be worthwhile to translate the title/rank, along the lines:

"#str_fm_map_shouldered_name_capt_riggs" "Capt Riggs"
"#str_fm_map_shouldered_name_first_mate_logah" "First Mate Logah"
"#str_fm_map_shouldered_name_second_mate_chaf" "Second Mate Chaf"

By story design, only Logah is actually shoulderable, but for completeness I’ll do all three. I’m using the same #str_ for both “shouldered_name” and “shouldered_name_dead”.

BTW, an argument could be made that the character #str_ renames should include the class, e.g., ..._atdm_ai_townsfolk_wench_..., ..._atdm_env_ragdoll_guard_thug_, etc. However, a class name, while explaining the visual appearance, can be somewhat remote from the precise role that the AI plays in the FM.

Prompt to ChatGPT

With Air Pocket specifics in italics.

You are an expert translator between English and other European languages, including Russian. You wish to translate a list of crew members on a small historic sailing ship, from English to these other languages. Each line of the list begins with a tab, then a word beginning with #str_ within double quotes, then another tab, then a phrase within double quotes. When you translate a line, in the output keep the tabs and the #str_ word unchanged, and only change the phrase to the other target language in UTF-8, keeping it in double quotes. Make the translated phrase reasonably short while preserving the formal meaning. Avoid modern slang. Old-fashioned wording is fine. At the end of each line, add another tab, the fixed text "//bt: ", and then a back-translation of the previously translated phrase into English again. When back-translating, ignore the original English phrase.

The crew members names are Riggs, Logah, and Chaf, all males. Keep the name-portion unchanged, but translate the title-portion. Some centuries ago, they were on a small coastal sailing ship, trading in goods along the coast. In that time frame, take account of different countries having different merchant marine titles aka ranks. "Capt" here is an informal title for "Captain". These titles may be less formal than those of military officers aboard navy ships.

Following the input list of crew members, append output lists in these target languages: 1. German 2. French 3. Polish 4. Italian 5. Spanish 6. Portuguese 7. Russian 8. Czech 9. Hungarian 10. Dutch 11. Slovak 12. Danish 13. Swedish 14. Romanian 15. Turkish 16. Catalan.

List of crew members:
"#str_fm_map_shouldered_name_capt_riggs" "Capt Riggs"
"#str_fm_map_shouldered_name_first_mate_logah" "First Mate Logah"
"#str_fm_map_shouldered_name_second_mate_chaf" "Second Mate Chaf"

Results

This time, ChatGPT did not place results in separate frames by language; because of fewer input strings? Also, between the title and name in every case was some odd “hollow box” character (possibly null character 0x00); I did a replace-all of those with <space>. What follows is the start of ChatGPT’s response (with my hollow-box correction).

Here are the translations for each language. I’ve preserved tabs and identifiers exactly, translated the title portion into the historical/formal equivalent, and added brief back-translations (“//bt: …”) into English.

1. German
"#str_fm_map_shouldered_name_capt_riggs" "Kpt Riggs" //bt: Capt Riggs
"#str_fm_map_shouldered_name_first_mate_logah" "Erster Maat Logah" //bt: First Mate Logah
"#str_fm_map_shouldered_name_second_mate_chaf" "Zweiter Maat Chaf" //bt: Second Mate Chaf

2. French
"#str_fm_map_shouldered_name_capt_riggs" "Capt Riggs" //bt: Capt Riggs
"#str_fm_map_shouldered_name_first_mate_logah" "Second Capitaine Logah" //bt: First Mate Logah
"#str_fm_map_shouldered_name_second_mate_chaf" "Tiers Capitaine Chaf" //bt: Second Mate Chaf

3. Polish
"#str_fm_map_shouldered_name_capt_riggs" "Kapn Riggs" //bt: Capt Riggs
"#str_fm_map_shouldered_name_first_mate_logah" "Pierwszy oficer Logah" //bt: First Mate Logah
"#str_fm_map_shouldered_name_second_mate_chaf" "Drugi oficer Chaf" //bt: Second Mate Chaf

4. Italian
"#str_fm_map_shouldered_name_capt_riggs" "Capitano Riggs" //bt: Capt Riggs
"#str_fm_map_shouldered_name_first_mate_logah" "Primo ufficiale Logah" //bt: First Mate Logah
"#str_fm_map_shouldered_name_second_mate_chaf" "Secondo ufficiale Chaf" //bt: Second Mate Chaf

... etc.

Other Observations

As you can see above, ChatGPT neglected to provide a leading <tab> to each line, so I had to.

I had forgotten to think about asking for “title case” in the prompt. Maybe just as well, since the 2 meanings of “title” (rank and capitalization) might have confused the AI. In the results, the person’s name was always left capitalized, as was the first word. In a few languages, the second word was also capitalized; in most, it was not. I wasn’t sure if that’s how ranks are usually treated in various countries. I decided (unlike the inventory case) not to adjust the AI-provided casing.

Finally, ChatGPT back-translated “Capt” always to “Capt”, never to “Captain”, even if the foreign language gave the full word, e.g., Capitaine. Probably evidence of its cheating here. I’m getting less confident in back-translation as a way to ensure quality... at least when done by the same AI that does forward translation, and at the same time. Or maybe it’s just a ChatGPT shortcoming, and other AIs would do better.
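The manual fixups described in this post (replacing the "hollow box" with a space and restoring the leading tab) could be scripted. What follows is only a minimal sketch, not something from the original workflow: the function name clean_line is made up for illustration, and the set of characters treated as the "hollow box" (control characters, U+00A0, U+FFFD) is a guess, since the post only suspects a null character.

```python
import re

# One entry: "#str_id" "phrase", optionally followed by //bt: back-translation.
LINE_RE = re.compile(
    r'^\s*"(#str_[A-Za-z0-9_]+)"\s+"([^"]*)"(?:\s*//bt:\s*(.*))?$'
)

# ASSUMPTION: the "hollow box" is a control character, a no-break space,
# or the Unicode replacement character. The post only guesses at 0x00.
ODD_CHARS_RE = re.compile(r'[\x00-\x1f\x7f\u00a0\ufffd]+')


def clean_line(raw):
    """Return a tab-delimited all.lang line, or None if raw is not an entry."""
    match = LINE_RE.match(raw)
    if match is None:
        return None  # e.g. a language header such as "1. German"
    str_id, phrase, back = match.groups()
    # Swap odd separator characters for one space, then collapse repeats.
    phrase = ODD_CHARS_RE.sub(' ', phrase)
    phrase = re.sub(r' {2,}', ' ', phrase).strip()
    line = f'\t"{str_id}"\t"{phrase}"'
    if back:
        line += f'\t//bt: {back.strip()}'
    return line


if __name__ == "__main__":
    sample = '"#str_fm_map_shouldered_name_capt_riggs" "Kpt\x00Riggs" //bt: Capt Riggs'
    print(repr(clean_line(sample)))
```

Lines that are not entries (language headers, blank lines) come back as None, so the caller can decide how to turn them into [<language>] headers.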

Saturday at 05:43 PM
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Inventory Item Results

The AI results, generated in a flash, were successful overall. The desired output format was achieved, and the back translations detected very few problems. An example result (before alphabetic ordering), from [Italian]:

"#str_fm_map_inv_key_sea_trunk" "Baule da mare" Sea chest
"#str_fm_map_inv_key_galley" "Cucina di bordo" Ship’s galley
"#str_fm_map_inv_key_mess" "Refettorio" Mess
"#str_fm_map_inv_key_master_key" "Chìave maestra" Master key
...

Caveats:
A judgement of “overall success” is without ground-truthing by human translators.
The back-translation uses the same AI as the forward-translation... loosely analogous to “grading its own paper”. Though told not to look at the original English when back-translating, is it cheating and lying?

Revisions and Refinements

Title Casing for Inventory Items. I should have explicitly prompted to generate this in the results. I manually corrected these (leaving separators like di, de, do, del in lower case), but a few cases required ChatGPT consultation to get the proper upper-case Unicode letter.

Mess. Two translations needed revision, because they seemed to suggest an eating area of a much larger or more luxurious ship:

[German] "#str_fm_map_inv_key_mess" "Offiziersmesse" Officers’ mess
[French] "#str_fm_map_inv_key_mess" "Salle à manger" Dining room

Rather than rerun the whole translation, I just asked ChatGPT about these cases:

On a small ship, what is the mess or eating area called in French?

On a small ship, the eating area is typically called la cambuse or le carré, depending on the context:
Le carré – This is the standard French nautical term for the crew's or officers’ mess (eating/living area) on a ship. It’s equivalent to the English term mess.
Carré des officiers – officers' mess
Carré de l’équipage – crew mess
La cambuse – Often refers to the ship’s food storage or sometimes the galley (kitchen), but in some informal or small-vessel contexts, it can be used more broadly, though this is less precise for "mess."
So for a small ship’s mess or dining area, the best and most nautical French term is le carré.

Consequently, I just replaced "Salle à manger" with “Carré”. Likewise, in German, simply “Messe”.

Master Key. This was back-translated as “Main key” in 6 of the languages, which I guess is OK. In Air Pocket, a Master Key (entity name key_master) was left in the blue room, if needed for a debugging build; inaccessible otherwise. Arguably, it should not be translated at all. So I’ll comment those non-English lines out. (“Master key” is not part of TDM’s base strings?)

Captain’s – Odd Punctuation and Perhaps Compound Unicode. The German translation of “Captain’s Cabin” had odd punctuation: "Kapitans¬kajüte" (and some evidence of a Unicode “combining character”). I did a separate follow-up with ChatGPT, and revised to drop the punctuation and add an umlaut over the 'a': Kapitänskajüte. Another German use of “Captain’s” was similarly revised. German reportedly never uses an apostrophe for the possessive form.

Captain’s – Title versus Name. There was one case in [Danish] where the word “Captain’s” was not translated, as if it was a person’s name. (Also, reportedly, Danish does not generally use apostrophes for possessives; there are exceptions, but they don’t seem to apply here.)

State of All.Lang So Far

Starting from a temporary file into which I pasted the raw AI results (with [<language>] headers added), I fabricated all.lang by:
Making sure it had Unix line endings, not CRLF. (In Notepad++, Edit/EOL Conversion/Unix).
Beginning it with a first-draft preamble comment, heavily adapted from TDM’s all.lang preamble. Following that, a line with just an opening bracket. And a closing bracket line at end of file.
Making the handful of translation corrections mentioned above.
Changing the casing to Title Case. (I didn’t bother changing the back-translation’s case.)
Tagging the back-translations with “//bt:”, so they are denoted and if need be can be quickly stripped out with an editor. (If subsequent revision is manually applied, the delimiter will also be altered; preamble will provide guidance.)

Lessons Learned So Far

Improvements to Prompting...
Specify that the FM’s ship is small.
Specify that “Captain” is a title, not a person’s name. (Hmm, there are some shouldered names, not touched by I18N.pl, that maybe should be partially translated too, with titles like “First Mate Logan”.)
For inventory items (and likely readables titles), ask the AI to make the output in Title Case.
Tell the AI not to generate Unicode combining characters.
Ask the AI to add a special delimiter “ //bt: “ before the back-translation.
To the extent possible, convert any directional punctuation (apostrophes, single quotes, double quotes) to non-directional, to comply with TDM font limitations.
Since it seems to give better results if you ask about one specific item (like “mess” in French), maybe it’s optimizing for speed instead of accuracy. Ask it to take more time?
ChatGPT translation seems to have problems with possessive forms... or at least those problems are more easily spotted during review. Speculation: maybe one cause of this is that I didn’t specify which country or regional dialect of a language to use. Perhaps a prompt to “prefer the form of language spoken in a language’s originating country, within or adjoining Europe.”

Concerns about Translation Length...
The results are generally short, but in-game will some of them prove to be too long? Traditionally, inventory names are limited to 2 lines, with “\n” needing to be inserted. This will need to be tested eventually.
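Several of the cleanups above (composing away combining characters, straightening directional apostrophes, title casing, stripping the "//bt:" tails) are mechanical enough to script. The following is a rough sketch under stated assumptions, not the process actually used for Air Pocket: all function names are hypothetical, the lower-case particle list is just the four separators named in the post, and curly double quotes are deliberately left alone because a straight double quote inside a quoted phrase would need escaping that I have not checked against TDM.

```python
import unicodedata

# Directional single quotes/apostrophes mapped to the plain ASCII apostrophe.
STRAIGHTEN = str.maketrans({"\u2018": "'", "\u2019": "'"})

# ASSUMPTION: only the separators named in the post are kept lower case.
PARTICLES = {"di", "de", "do", "del"}


def normalize_phrase(text):
    """Compose base letter + combining mark pairs (NFC), straighten apostrophes."""
    return unicodedata.normalize("NFC", text).translate(STRAIGHTEN)


def has_combining(text):
    """True if any combining character survives, so the phrase needs review."""
    return any(unicodedata.combining(ch) for ch in text)


def title_case(phrase):
    """Upper-case each word's first letter, except particles after word one."""
    words = []
    for index, word in enumerate(phrase.split(" ")):
        if index > 0 and word.lower() in PARTICLES:
            words.append(word.lower())
        else:
            words.append(word[:1].upper() + word[1:])
    return " ".join(words)


def strip_back_translation(line):
    """Drop the //bt: tail from a line, leaving untagged lines unchanged."""
    head, sep, _tail = line.partition("//bt:")
    return head.rstrip() if sep else line


if __name__ == "__main__":
    decomposed = "Kapita\u0308nskaju\u0308te"  # 'a' and 'u' + combining diaeresis
    print(normalize_phrase(decomposed), has_combining(decomposed))
    print(title_case("cucina di bordo"))
```

NFC only helps where a precomposed letter exists; anything that still trips has_combining afterwards would need a manual look, as would phrases where a particle is fused to the next word with an apostrophe.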

Thursday at 11:47 PM
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Desired Results File with All Languages

What I want to end up with from our AI (with any iterative fixup and manual integration) is an FM-only version of TDM’s UTF-8 all.lang. That file has 17 language sections in a particular order, which we will adopt too (in the prompt further below), although, other than English being first, it doesn’t really matter. Once we have our FM all.lang, we can easily generate all the required ISO-encoded *.lang files, e.g., french.lang.

Strategy of Feeding the AI

One approach would be to just feed the entire #str list in one gulp, with prompt engineering that covers all aspects. This would minimize the post-translate integration time. But the concern is that prompt engineering becomes more difficult. The AI might get confused about what restrictions and hints apply to which strings. While sometimes shared context across strings can be helpful, too much shared context could lead to overly-creative translations (e.g., hallucinations). [BTW, if accessing the AI through an API, there’s often a "temperature" value you can specify, from 0.0 to 1.0, from most-predictable to most-creative. We can use a few words in the prompt to approximately achieve similar ends.]

So, to maintain more control, I’m going to batch-feed. The assumption is that translating #str_ in batches of related input groups will allow more focused guidance from prompt engineering, leading to better results. I’ll start with inventory items, which have the shortest strings and most dictionary-like lookup.

An alternative/additional batching (particularly needed with large FMs) would be by "scene". In the case of Air Pocket, it could be thought of broadly as 4 scenes, based on timeline and location. The story, as driven by objectives, is fairly linear; larger FMs would typically have some randomization in scene order. Would batching by scene be useful (i.e., give better AI results) for some of Air Pocket’s #str_ s? Thinking this over. But for now, treat inventory items independent of scene.

Prompt Engineering for Inventory Items

A stab at a reusable template follows in blue. It describes the overall translation task, the desired tone, and input and output formats. Text specific to inventory items is shown in bold. Text that is specific to this FM, to clarify the context and the meaning of particular words, is in italics, with spoilers hidden.

You are an expert translator between English and other European languages, including Russian. You wish to translate a list of inventory items, all inanimate objects, from English to these other languages. Each line of the list begins with a tab, then a word beginning with #str_ within double quotes, then another tab, then a phrase within double quotes. When you translate a line, in the output keep the tabs and the #str_ word unchanged, and only change the phrase to the other target language in UTF-8, keeping it in double quotes. Make the translated phrase reasonably short while preserving the formal meaning. Avoid modern slang. Old-fashioned wording is fine. At the end of each line, add another tab, and then add a back-translation of the previous phrase into English again. When back-translating, ignore the original English phrase.

Most of the inventory items are keys, and the associated phrases describe locked doors to particular locations aboard a ship, or locked trunks or safes on a ship. The "Master Key" opens all locks.

Following the input list of inventory items, append output lists in these target languages: 1. German 2. French 3. Polish 4. Italian 5. Spanish 6. Portuguese 7. Russian 8. Czech 9. Hungarian 10. Dutch 11. Slovak 12. Danish 13. Swedish 14. Romanian 15. Turkish 16. Catalan.

List of inventory items:
[... skipping 1 potential spoiler]
"#str_fm_map_inv_key_captains_cabin" "Captain's Cabin"
"#str_fm_map_inv_key_captains_safe" "Captain's Safe"
"#str_fm_map_inv_key_galley" "Galley"
"#str_fm_map_inv_key_master_key" "Master Key"
"#str_fm_map_inv_key_mess" "Mess"
"#str_fm_map_inv_key_sea_trunk" "Sea Trunk"

Using ChatGPT

As discussed at the outset, you can use this without signing in (it will nag you). Also, if you’d like it not to retain your input for training purposes, click on the circled question mark and change it under "Settings".

As of this post, if you ask ChatGPT what model it’s using, it responds "You're currently chatting with GPT-4o, the latest model from OpenAI as of 2025. The "o" stands for "omni" — it's designed to handle text, images, and more, all in one model." Following up by enquiring about usage limitations, it says "Free users can access GPT‑4o, but with strict usage caps, which vary based on demand and time of day". More specifically, "usage falls in the range of 5–16 messages per 3–5 hours, after which you'll be limited or switched to GPT‑4o‑mini." The latter is a faster but lower-accuracy model. "We’ll notify you once you’ve reached the limit and invite you to continue your conversation using GPT-4o mini or to upgrade to [paid] ChatGPT Plus."

Because I’m doing this at a leisurely rate (and reporting it to you in posts), the usage restrictions should not bite.

About Input and Output Formats

As you can see above, the input is the AI prompt, appended with content from english.lang, namely, the lines between the "{" and "}" brackets. For those lines, no change to tab-separation is done. The output is the same format, but with an English back-translation added to each line.

When ChatGPT generates the response, each requested language is enclosed in its own HTML response frame, with a separate "copy" link. So you have to copy each frame separately, pasting them successively into your FM-specific all.lang file while adding headers, e.g., [French]. Also, the frame margin contains the word "vbnet". When I asked, ChatGPT indicated that’s the style of syntax highlighting applied to the results, based on source material, but it may be inappropriate and so ignorable. Which explains why the word "key" was always colored green.

In the next post, I’ll discuss the specific results.
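Since the input is just the prompt text followed by selected lines from between the brackets of english.lang, assembling a batch could be automated. The sketch below is illustrative only and was not part of the process described in these posts: extract_entries and build_prompt are hypothetical names, the template argument stands in for the full prompt text, and selecting a batch by #str_ prefix is an assumption based on the naming scheme used for this FM.

```python
def extract_entries(lang_text, prefix):
    """Collect entry lines between the { and } lines whose id starts with prefix."""
    inside = False
    picked = []
    for line in lang_text.splitlines():
        stripped = line.strip()
        if stripped == "{":
            inside = True
        elif stripped == "}":
            inside = False
        elif inside and stripped.startswith('"' + prefix):
            picked.append(line)  # keep the original tab separation untouched
    return picked


def build_prompt(template, list_label, languages, entries):
    """Append the numbered target languages and the input list to the template."""
    numbered = " ".join(
        f"{number}. {name}" for number, name in enumerate(languages, start=1)
    )
    parts = [
        template.strip(),
        f"Following the input list of {list_label}, append output lists "
        f"in these target languages: {numbered}.",
        f"List of {list_label}:",
        "\n".join(entries),
    ]
    return "\n\n".join(parts)


if __name__ == "__main__":
    english_lang = (
        "// preamble comment\n"
        "{\n"
        '\t"#str_fm_map_inv_key_galley"\t"Galley"\n'
        '\t"#str_fm_map_inv_key_mess"\t"Mess"\n'
        '\t"#str_fm_map_obj01_no_hurt_or_loot_crew"\t"Rile neither..."\n'
        "}\n"
    )
    batch = extract_entries(english_lang, "#str_fm_map_inv_")
    print(build_prompt("You are an expert translator...", "inventory items",
                       ["German", "French"], batch))
```

The result would still be pasted into the ChatGPT web page by hand; nothing here calls an API.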

June 30
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Could be good, could be a cockup. Something to explore for the future. For now, moving on to the translations.

June 27
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Renaming the Strings

What I describe here is a bit of a fib, reflecting where I ended up, not the iterative process. And some renaming was refined after the first trial AI run.

Before doing the translation, I’m going to rename all the FM-specific strings. Why? It’s fair to say the #str_ system was never a hit with DR developers, so features that would support it are scant. A way to make it less painful for mappers to inspect a post-conversion FM in DR is to change the #str_<5-numbers> to #str_<meaningful string>. By convention, <meaningful string> is limited to ASCII alphanumeric characters plus underscore, with no spaces. This generally does not include a version of the full English string (and there are length limitations), but something that clues the mapper. And groups things helpfully.

More specifically, I’m going to rename all the FM-specific strings from #str_2xxxx to:

#str_fm_<file_source><grouping_and_ordering>_<unique hint>

The <grouping_and_ordering> substrings are chosen with an eye to both viewing in DR and group translation. (Later, I’ll talk about batch-processing strategies. The group-naming here does not use a “by scene” strategy.)

So this is done first in english.lang and then in the altered assets (in .map, .xd). As I search for “#str_” in assets, I am aware that some strings are TDM-level defined, so I should not rename them. These have stringID numbers under 20000, or in theory beginning with #str_main_menu. There were 2 of these found in airpocket: items with inv_name or inv_category given by #str_10052 and #str_02381.

Specific Renamings and Examples

I’ll be truncating example strings here for brevity and to reduce spoilers. The categories (and an example or two of each) follow.

Darkmod.txt and Readme.txt. As mentioned, TDM doesn’t really handle translations of these. Just for completeness...

"#str_20000" ==> "#str_fm_darkmod_txt__title" "Away 1: Air Pocket"
"#str_20001" ==> "#str_fm_darkmod_txt_desc" "On the run from her husband, me and my girl. With a bribe to a ship's captain, we're away. What could go wrong now? Oh, dammit."

Notice that here (and elsewhere) I’ve added an extra “_” before “title”. This is so, when sorted alphabetically, the title comes first before the description (in this case) or body.

Mission Briefing. This tells a story, so be sure that its titles (if any), bodies, and pages are well-ordered to present to the AI. Since the briefing text is long and complex, I’ve opted not to include a #str <hint> for it.

"#str_20026" ==> "#str_fm_mission_briefing_xd_pg1_body" "Life is sweet! It's been 3 days since Emily ran away with me, leaving behind her bastard husband. He's miles away now, as our tradeship Esmeralda plies herself down the coast.\n [...] \nAnyway, Emily's set up a cozy bunk for us in the bow galley.\n"

Readables. Like the briefing, a readable often tells a story and should be well-ordered to present to the AI. I’ve chosen to skip hints for these.

"#str_20024" ==> "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2__title" "Further Instructions\n"
"#str_20025" ==> "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2_body" "\n\n\n\nAs agreed, the 2 new crewmates will bunk in the galley, and share the sea trunk nearest the door in the mess.\n\nCaptain Riggs\n"

Inventory (mainly keys). The hints for these are usually just the full text, in lower case with punctuation dropped. They could include info about item class or difficulty, but I didn’t find it necessary here.

"#str_20021" ==> "#str_fm_map_inv_key_captains_cabin" "Captain's Cabin"

(I went with “...fm_map...” here instead of “...fm_airpocket_map...”. I hope that works out. Too bad “map” itself has multiple TDM meanings, e.g., “hungarian.map”; in-game map. Also, since the order in which these inventory items are discovered in the game is pretty variable, I decided not to embed a numeral to force ordering.)

Objectives. The objectives as coded in the .map file (and so generated english.lang #str_ numbers) are generally not listed in objective number order. It’s helpful to so reorder the rows. Also, because there are more than 9 objectives, I added a leading “0” to those with a single digit, for lexical ordering. The “obj<num>” is the DR/.map-internal objective number, which loosely corresponds to presentation to the game player, although objectives come and go.

"#str_20008" ==> "#str_fm_map_obj01_no_hurt_or_loot_crew" "Rile neither the captain nor crew. No assaults by me or gratuitous thieving... not that there's much to steal."

Thought messages. The I18N.pl script did not catch these, so they are manually added (and there’s no #str_2xxxx to begin with). I included an ordering digit after “msg”, roughly reflecting game occurrence (though difficulty and player actions will affect which of these appear and when):

"fm_map_thought_msg1_sword_stolen" "My sword's been pilfered... I'm not surprised. It's likely still on-board somewhere."

Ordering the Strings

The #str_ lines in “english.lang” are extracted and put into ASCII alphanumeric order. I used Notepad++ with Edit/Line Operations/Sort Lines Lexicographically Ascending. Any full-line grouping comments (beginning with //) will need to be manually repositioned after this.

Programmatic Ideas to Better Support Renaming

This would be helpful: A program that took the original 5-digit #str form of english.lang, and the renamed-#str english.lang form, and did the surgery on <fm>.map and *.xd.

Or alternatively you could make a version of english.lang with an extra tab-separated field; this is what I created manually as a reference for myself, with lines like:

"#str_20003" "#str_fm_map_inv_key_sea_trunk" "Sea Trunk"

Then have a program that took that and did the surgery, using the first 2 fields.

Or most ambitious of all, a version of I18N.pl that created alphanumeric #str_ in the first place, instead of 5-digit ones. Using something like the naming scheme above, this is mostly straightforward. However, for strings that are long sentences, if you still wanted to automatically summarize them into a few words for the #str <hint>, you’d probably need to call upon AI.

Next Up

I’ll make a version of this data available a little later on. Next post: prompt engineering!
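The second idea above (a reference file whose first 2 fields drive the surgery on the assets) might look roughly like this. It is a sketch under assumptions, not an existing TDM tool: load_renames and apply_renames are hypothetical names, and the sample spawnarg lines are only there to show the behavior. IDs that do not appear in the reference file, such as the TDM-level #str_10052, pass through untouched.

```python
import re

# A reference row: "#str_<5 digits>"  "#str_<new name>"  "English text"
ROW_RE = re.compile(r'"(#str_\d{5})"\s+"(#str_[A-Za-z0-9_]+)"')

# A 5-digit id in an asset; the lookahead avoids clipping a longer identifier.
OLD_ID_RE = re.compile(r'#str_\d{5}(?![A-Za-z0-9_])')


def load_renames(reference_text):
    """Build {old_id: new_id} from the first 2 fields of each reference row."""
    renames = {}
    for line in reference_text.splitlines():
        match = ROW_RE.search(line)
        if match:
            renames[match.group(1)] = match.group(2)
    return renames


def apply_renames(asset_text, renames):
    """Swap every known 5-digit id in .map or .xd text; leave unknown ids alone."""
    return OLD_ID_RE.sub(
        lambda match: renames.get(match.group(0), match.group(0)), asset_text
    )


if __name__ == "__main__":
    reference = '"#str_20003"\t"#str_fm_map_inv_key_sea_trunk"\t"Sea Trunk"\n'
    asset = '"inv_name" "#str_20003"\n"inv_category" "#str_10052"\n'
    print(apply_renames(asset, load_renames(reference)))
```

For the ordering step, Python's sorted() on the renamed lines compares by code point, which for these ASCII-only identifiers should give the same order as the Notepad++ sort mentioned above; grouping comments would still need to be repositioned by hand.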

June 27
12 replies

Newbie DarkRadiant Questions

Geep replied to demagogue's topic in TDM Editors Guild

I know sometimes in the past I've misconfigured DR's File/"Game/Project Setting..."/ and gotten mystery problems similar to what you describe.

June 27
12050 replies

Fan Mission: Away 1 - Air Pocket, by Geep (2020/02/07)

Geep replied to Geep's topic in Fan Missions

June 27
63 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Starting the Conversion with I18N.pl

Folks have found that it is best to do the conversion only when an FM is complete, which is certainly the case here. As a prerequisite, I installed and tested Strawberry Perl on my Win11 dev box. It worked this time, unlike in the distant past on a different, Win10 machine, where I tried and failed for 3 days to get it to work.

Then, as specified in our wiki’s I18N page, I set up a directory with the airpocket.pk4, and a copy of TDM 2.13 strings/english.lang. (The wiki specification missed that last step, which I corrected.) Running the I18N.pl script generated the expected results in its “output” folder: an altered airpocket.pk4 and new airpocket_l10n.pk4. Within both, a /strings/ folder was now present with just an english.lang file, with 28 #str_ entries, covering:

maps/airpocket.map
xdata/airpocket.xd
xdata/mission_briefing.xd
darkmod.txt
readme.txt

However, I believe #str_ support for the last 2 was never implemented (and the output versions of those files don’t include #str_ s). So this will be lowest priority for me.

Also, I don’t see any auto-generated #str_ for the 5 player “thought messages”. Probably need to add these. English text for these is in the .map file (found by searching for tdm_message_no_art).

June 26
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Scope of Work

TDM currently supports a partial localization with a moderate amount of work. With a heroic effort, and a separate build on a separate site like DarkFate.org for Russian, one particular language can be fully supported. I'm not striving for that. Instead, I'm looking at AI-enabled moderate-effort partial localization across ALL TDM-supported languages for a particular FM.

Choosing a Test FM

As a test case, I'm going to report on internationalizing my small FM "Away 1: Air Pocket" (install name is airpocket). This FM is an "internationalization virgin". Furthermore, here's what it DOESN'T have:

Custom video or speech audio
Custom inventory items or weapons
Custom images with embedded language text
Signage
Mission title & credits on splash screen (as either text overlay or embedded)

Instead, these are my work targets, to translate to ALL languages:

Text briefing
Standard inventory items, mainly keys, but with custom names.
Objectives
A few sheet readables
Some "thought messages" from the player (in the form of white text in midscreen, done with atdm:gui_message with atdm_gui_message_no_art.gui)

June 23
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

With prompt engineering. Be starting to detail that soon. For now, here's my next post...

June 23
12 replies

Using AI for TDM Translations: An Exploration

Geep replied to Geep's topic in The Dark Mod

Good to know about, particularly if ChatGPT doesn't pan out.

June 22
12 replies

Using AI for TDM Translations: An Exploration

Geep posted a topic in The Dark Mod

Back in the day, Tels managed a squad of volunteer translators for TDM. I am not Tels, and could never do that.

Nowadays, language translation using AI, either traditional machine learning (ML) models or large language models (LLMs), is common and increasingly fluent. It is often used as an adjunct to speed the work of professional human translators. By itself, AI translation can be imperfect but usually sufficient. Can this "sufficient" approach be used for TDM, to expedite translations? Let's see.

I gave some initial thought to a bulk-translation daemon that might range across FMs and fill in all missing translations, without necessarily involving mappers. In the future, possibly AI could tackle that whole enchilada. I was at first visualizing something more modest: a backbone in a standard programming language (I sketched out C++ and C# projects, but lots of other possibilities) that would make calls to an API (I looked at those of Google Translate and ChatGPT). However, I changed focus due to certain concerns...

Different FMs, and subsets within each FM, would likely have far better translations if they were properly grouped, ordered, and translated separately, with an appropriate context (e.g., phrase engineering) added. The FM's mapper is best placed to provide this grouping and context. I'll detail what I mean in the next few posts.

The mapper would not be expected to know any TDM-supported languages besides English. Instead, each translated phrase could be back-translated to English and examined. Is the "round-trip" meaning OK, even if the English words have changed? Problematic translations could have their context tweaked and rerun.

Many AI systems, particularly for API access, require a billing commitment (e.g., credit card). For a professional translator, this is no problem, and subscriptions allow access to more (and putatively better) models and higher quotas. This seems less appealing for TDM. A few paid AI systems have a no-subscription, pay-as-you-go account tier. The cost per translate is typically pennies. But it does introduce quota- and expense-management, and may exclude API usage.

Access via API requires an API key (or at the higher end a more elaborate security regime), with attendant key-security headaches.

Which AI model is thought "best" for translation? Doesn't matter too much, because we can't afford the best. Furthermore, there's endless churn among AI models, with anecdotal reports that a given model fluctuates in quality over time, and successor models can be worse than their predecessors.

So, with these concerns in mind, I looked for public web-based AI sites that require no billing and provide low-quota but adequate AI. The mapper would enter and retrieve data manually. I will focus on ChatGPT in this exploration, after a quick preliminary test confirmed some promise.

Also, as this exploration proceeds, I hope to propose changes to TDM to make it more viable for "sufficient" quality machine translation. Problem areas are incomplete fonts, space-constraints, and layout issues for translated strings. My proposals will likely surface as separate forum threads.

That's enough for now. I'll be trying for 1 or 2 substantive posts per week, as I tackle a particular FM.

June 21
12 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

Just to complicate your life, there are 3 additional aspects to consider about the circa-2014 Mason files, and subsequent circa-2017 improvements to the 'english' version perhaps applicable to your work. (These issues are covered in the wiki "Mason Font" article, with a bit more in my "Analysis of 2.12 TDM Fonts", https://forums.thedarkmod.com/index.php?/topic/22427-analysis-of-212-tdm-fonts/. The 2017 changes can be seen in the *current* 2.13 TDM English Mason files.)

1) Need for custom DAT-scaling on certain Mason characters

The source TTF had upper-case and lower-case characters that were early-on considered too similar in size. So (before 2014) in the DAT, selective per-character scaling was used to differentiate them. See https://wiki.thedarkmod.com/index.php?title=Font_Metrics_%26_DAT_File_Format#Per-Character_Font_Scaling for details. As you add new characters, you should do likewise (relatively easy with refont).

2) Creating the "glow" of mason_glow

How Tels created the glow (for 'english' carleton & mason) is discussed in reasonable detail here: https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/5/#findComment-262661 That could be done for Russian too, which I recall currently fakes a glow, and possibly would require a minor GUI or engine code change to use.

Note: To best accommodate glow and retain GIMP-visualization-alignment between base and glow characters, Tels moved some base characters within their bitmap, to keep their glyphs 2-3 pixels away from any bitmap edge. You should consider this when placing new base glyphs.

Note: For the 3 mason bitmaps doubled in size circa-2017 as discussed next, the mason_glow bitmaps were also doubled.

3) Extensive bitmap editing to solve main menu character jaggedness.

On Oct. 5, 2017, @Springheel in https://forums.thedarkmod.com/index.php?/topic/19129-menu-update/#findComment-412921 said: "Looking at the Mason fonts, it looks like they were super low res to begin with, and were then just resized [presumably referring to per-character scaling], making them even worse. I'll see what I can do." [Further on, referring to fonts in the TDM menu system:] "It appears that resizing the dds file to make it higher res is possible, so I'll proceed."

Later, on Oct 13, 2017, he concluded, within a "More detailed list of changes": "Updated the menu fonts, which were surprisingly bad before"

Unfortunately, I couldn't find details on how this work was actually done. I assume the bitmap editing was all done in GIMP. It started with doubling the size of certain bitmaps from 256x256 to 512x512. This was done for the first 3 bitmaps (i.e., those with ASCII, some Latin-1). Then characters were made more crisp and smooth-edged. How? Dunno. Also, some odd but harmless artifacts happened within GIMP (noted in https://forums.thedarkmod.com/index.php?/topic/22427-analysis-of-212-tdm-fonts/page/3/#findComment-499660)

June 2
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

Took a while to locate files and make things coherent. @kalinovka, read the contained "About..." files for more: https://drive.google.com/file/d/1P977mRCtn9JuAJR1E9SLTS8XeFz_XvNz/view?usp=sharing

May 31
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

I haven't used Font Lab, but perhaps there's an online user community that can help. Drawing font glyphs is challenging but, yes, possible. I can help you with Mason files, but I need to find the right version... I'll get back to you. As for DAT, think of it this way: a TDM font is described by BOTH DAT and TGA/DDS. DAT has the metadata needed for character spacing and scaling. TGA/DDS has the glyph images. See https://wiki.thedarkmod.com/index.php?title=Font_Metrics_%26_DAT_File_Format
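To make the DAT/bitmap split concrete, here is a rough sketch of how the per-glyph metadata in a DAT can be pictured as a C++ record. This is my reading of the id Tech-style layout, not something lifted from TDM source: the struct and field names (GlyphInfo, FontInfo, xSkip, etc.) and the 256-glyph count are assumptions, so check them against the wiki page above before relying on them.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed: one DAT file describes 256 glyph slots (codepoints 0..255).
constexpr int kGlyphsPerFont = 256;

// Assumed per-glyph record: spacing/placement metadata plus the
// texture coordinates that locate the glyph image inside a TGA/DDS bitmap.
struct GlyphInfo {
    std::int32_t height;       // glyph height in pixels
    std::int32_t top;          // top of glyph relative to the baseline
    std::int32_t bottom;       // bottom of glyph relative to the baseline
    std::int32_t pitch;        // width used for spacing
    std::int32_t xSkip;        // horizontal advance to the next character
    std::int32_t imageWidth;   // drawn width of the glyph image
    std::int32_t imageHeight;  // drawn height of the glyph image
    float s, t;                // upper-left texture coordinate in the bitmap
    float s2, t2;              // lower-right texture coordinate in the bitmap
    std::int32_t glyph;        // runtime handle; not meaningful on disk
    char shaderName[32];       // which bitmap (TGA/DDS) holds the glyph image
};

// Assumed whole-file layout: all glyph records, then a scale and a name.
struct FontInfo {
    GlyphInfo glyphs[kGlyphsPerFont];
    float glyphScale;
    char name[64];
};

// Under these assumptions a record is 80 bytes and a DAT is 20548 bytes.
static_assert(sizeof(GlyphInfo) == 80, "unexpected padding in GlyphInfo");
static_assert(sizeof(FontInfo) == 20548, "unexpected padding in FontInfo");
```

The point of the sketch is only that everything about spacing and scaling lives in these records, while the pixels themselves live in whichever bitmap shaderName points at.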

May 30
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

Regarding the existing Russian version of TDM's MasonAlternative font, this had a different origin than those Russian fonts processed by Keeper_Riff. Tels created this in 2012. He started from bitmaps of an ASCII Mason font, then used his Perl patch program to copy selected ASCII glyphs (those that resemble Cyrillic in some way) to the new font "MasonAlternative". See https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/15/#findComment-274617 In GIMP, he flipped or otherwise hand-edited them to make them Cyrillic. He said, "There are still a few dozen missing, but this is enough to render the two headlines we have (New Mission and Setting)" https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/15/#findComment-274623 This accounts for the incomplete coverage. Speculatively, he took this approach because he couldn't find a Mason-style TTF font with both Russian characters and an acceptable license (e.g., public domain, or at the least freely redistributable for non-commercial use). @kalinovka, I wonder what the licensing is for your masonchronicles3.ttf.

May 29
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

Evidently a significant portion of the Cyrillic work was done by Keeper_Riff (in conjunction with Tels) back in 2011. These folks are not active in TDM these days. Keeper_Riff outlined a workflow, starting with FontLab to edit TTF files... https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/12/#findComment-271548 Specifically Carleton: https://forums.thedarkmod.com/index.php?/topic/12863-translating-the-tdm-gui/page/4/#findComment-262135

May 28
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

Yes, refont just makes it easier to make changes to a .dat file (via a human-editable .ref file). To transform your .tga's into .dds, either - use a converter program (See https://wiki.thedarkmod.com/index.php?title=DDS_Creation_with_ATI_Compressonator) (recommended) use an image editor, most commonly GIMP (Photoshop will also work). You can read in and then export each file separately. But more typically, with GIMP, all the .tga's associated with one of the three TDM-supported font sizes are read in as separate layers, into a common GIMP project file (saved as an .xcf file). Ordinarily, you set only 1 layer visible. If you do any bitmap editing, the .xcf file becomes in effect the source master. Having all the files together as layers makes it easier to move a glyph (or copy parts of glyphs) from one layer to another. You use GIMP's Export feature (you specify the .dds extension in the filename entry field at the top, to tell it the format). Be sure to generate mipmaps. See: https://wiki.thedarkmod.com/index.php?title=DDS_Creation_with_GIMP and https://wiki.thedarkmod.com/index.php?title=Font_Bitmaps_in_DDS_Files, particularly "Editing a Bitmap".

May 28
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

@kalinovka, I'm assuming your TTF font codepoints are those of Unicode. A conversion to Win1251 would use this map: https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
The lower 128 are just ASCII, where Unicode and Win1251 use the same number. The upper 128 contain the 90 codepoints you mentioned (starting with 1025 aka 0x0401), interspersed with other characters. A proper rendering for TDM would include all these characters. It is possible that some old-school font editor already has an export profile for Win1251.
As for hacking up a variant of ExportFontToDoom3All256, you'd need to write a static array, holding Unicode values from CP1251.TXT. You could just do the upper 128 if you want, as shown next. Example:

    static const unsigned long UnicodeFor1251[] = {
        0x0402, // 0x80 CYRILLIC CAPITAL LETTER DJE
        0x0403, // 0x81 CYRILLIC CAPITAL LETTER GJE
        ... // more tedious or fancy editing here
    };

Then the code loop would be something like [NOT TESTED]:

    // Export all characters.
    unsigned long sourceCharacterCode;
    for (int outputCharacterCode = 0; outputCharacterCode < Font::numCharactersToExport; outputCharacterCode++)
    {
        // Lower half is ASCII (same number in Unicode); upper half comes from the table.
        if (outputCharacterCode < 128)
            sourceCharacterCode = outputCharacterCode;
        else
            sourceCharacterCode = UnicodeFor1251[outputCharacterCode - 128];
        bool okay = exportCharacter(sourceCharacterCode, outputCharacterCode);
        if (!okay)
        {
            std::cerr << "Error: Unable to export character " << getCharacterCodeString(outputCharacterCode) << "." << std::endl; // WAS characterCode
            return false;
        }
    }
    ...

Further down in FontExporter.cpp, more changes...

    bool FontExporter::exportCharacter(unsigned long sourceCharacterCode, int outputCharacterCode) // WAS single parameter characterCode
    {
        Doom3GlyphDescriptor* doom3GlyphDescriptor = 0;
        // Get the index of the glyph that represents this character.
        int glyphIndex = self.font->getGlyphIndexForCharacterCode(sourceCharacterCode); // WAS characterCode
        ...
        // Create a descriptor for the current glyph.
        doom3GlyphDescriptor = &self.doom3GlyphDescriptors[outputCharacterCode]; // WAS characterCode
        ...
    }

After a successful export, there's still lots more testing and tweaking to be done, e.g., with datBounds, refont, if you want best character spacing and presentation. Also, TDM treats codepoint 0xFF specially, as mentioned in https://wiki.thedarkmod.com/index.php?title=I18N_-_Charset
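As a standalone illustration of the lookup that the export loop needs, here is a minimal sketch of a Win1251-to-Unicode function. It is not part of the exporter: the name win1251ToUnicode and the kNotCovered sentinel are made up for this example, and it deliberately covers only the parts of the table that follow a simple pattern (ASCII, the contiguous 0xC0-0xFF block, and the two IO letters). Every other upper-half entry still has to be filled in from CP1251.TXT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel: this sketch has no mapping for the given byte.
constexpr std::uint32_t kNotCovered = 0;

// Hypothetical helper: Unicode codepoint for a Win1251 byte (0..255).
// Only the regular parts of the table are handled here; the remaining
// entries in 0x80..0xBF must be taken from CP1251.TXT.
inline std::uint32_t win1251ToUnicode(int code)
{
    assert(code >= 0 && code <= 0xFF);

    // Lower half: identical to ASCII.
    if (code < 0x80)
        return static_cast<std::uint32_t>(code);

    // 0xC0..0xFF: the basic Russian alphabet, contiguous in Unicode
    // from CYRILLIC CAPITAL LETTER A (U+0410) to SMALL LETTER YA (U+044F).
    if (code >= 0xC0)
        return 0x0410u + static_cast<std::uint32_t>(code - 0xC0);

    // The two IO letters sit apart from the main block.
    if (code == 0xA8)
        return 0x0401u; // CYRILLIC CAPITAL LETTER IO
    if (code == 0xB8)
        return 0x0451u; // CYRILLIC SMALL LETTER IO

    // Everything else in 0x80..0xBF: look it up in CP1251.TXT.
    return kNotCovered;
}
```

In the loop, a call like win1251ToUnicode(outputCharacterCode) would stand in for the array lookup, with kNotCovered entries skipped or reported until the table is complete. Keep in mind the special handling of 0xFF in TDM when deciding what to export into that slot.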

May 28
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

@kalinovka, probably no quick solution. I imagine, with a font editor that reads/writes TTFs, you could relocate the Cyrillic down to the 0-255 range in a custom TTF, which could then be processed by ExportFontToDoom3All256. Or, if you know C++, you could make a variant version of ExportFontToDoom3All256 with a different input range (both start and end) in the loop.* The wiki page contains a link to the source code for a Visual Studio build. In either case, you'd want to order the glyphs (or glyph processing) as Win-1251 (and TDM) expects, so the generated .DAT files would require minimal fixup.
* Specifically, you'd start with FontExporter.cpp, and in function FontExporter::export_, change the loop indices of:

    // Export all characters.
    for (int characterCode = 0; characterCode < Font::numCharactersToExport; characterCode++)
    ...

But if that was all it took, I'd be very surprised.

May 27
14 replies

Font localization

Geep replied to kalinovka's topic in TDM Tech Support

@kalinovka, you may have better luck with ExportFontToDoom3All256, described and available here: https://wiki.thedarkmod.com/index.php?title=ExportFontToDoom3 Beyond that, I haven't done anything personally with Russian fonts. Maybe the folks at DarkFate have further ideas about conversion from TTF to DAT/DDS (e.g., via some font editor?)

May 27
14 replies

Newbie DarkRadiant Questions

Geep replied to demagogue's topic in TDM Editors Guild

@BoilerDunce, in the past, I've sometimes had similar problems. Often, it's because something outside the perimeter was invisible, e.g., the center of an area light. That's usually because I forgot to turn off DR Filters. Or if you use DR Layers, there's similar potential for hiding. Other times, there really are some tiny crumbs or shards of brushes or patches out there in space. They're just really hard to see. That can occur if, say, you import a prefab, then move it to where it needs to go while some DR filters are on, so you don't realize you're only moving part of it. Don't strain your eyes too much. Just make a big box around the suspect "empty space", use DR "select inside" or "select touching", and delete.

May 25
12050 replies
- 1

Visible Player Hands Mod

Geep replied to jivo's topic in The Dark Mod

Very nice additions. I particularly admire the throwing animations. The drinking ones are fun. I'd love to see the bottles come a bit closer/lower towards the mouth, and maybe a few drops splash upwards. The lock-picking is more problematic, and I'd probably like to see an option to enable/disable it. (This could be command-line, given that you're trying to get by with one GUI setting.) If the player only has 1 lock pick, does the animation still show both? If so, that's not good. But you could substitute a 3rd tool that is clearly different, e.g., a flat file or knife with a different color handle. The bigger problem is, as others have mentioned, even if the lock hole is clearly seen, the picks are off somewhere else. A thought. Suppose for a moment that your code knew where on the screen the hole was [a hard but usually plausible thing to determine, at least for standard assets]. Then maybe, rather than moving the player, your hands animation could be dynamically shifted left/right/up/down to hit it when it's in range. (More of the forearms/upper arms would have to be included in the animation, and the offscreen parts clipped, never screen-wrapped.) In any event, I'm looking forward to seeing how you animate the upcoming items you mentioned.

May 16
116 replies
- 2

Fan Mission: The Wizard's Treasure [15th Anniversary Contest]

Geep replied to thebigh's topic in Fan Missions

@Zerush, good to know about these. Related: in Sotha's "Skybox Tutorial", I removed the bad link to Terragen, and put in a little commentary about Terragen 4. But it requires someone who has actually used that tool (or an alternative) for an FM to really refresh that article.

May 9
51 replies
- 1

Fan Mission: The Wizard's Treasure [15th Anniversary Contest]

Geep replied to thebigh's topic in Fan Missions

Evidently a modern version of Terragen (now Terragen 4 and Terragen Sky) is available: https://planetside.co.uk/

May 4
51 replies
