The Dark Mod Forums

Geep

Contributor
  • Posts: 1217
  • Days Won: 62

Everything posted by Geep

  1. Back to the on-going translation effort. Here are some general observations and lessons learned while translating the main readable.

     Recap

     This readable, mentioned earlier, has an all.lang entry of:

     [English]
     // Readables in airpocket.xd:
     "#str_fm_airpocket_xd_sheet_appointment_to_service_pg1__title" "Appointment to Service\n"
     "#str_fm_airpocket_xd_sheet_appointment_to_service_pg1_body" "\n\nHEAR YE ALL, that the Bearer is found Fit to Serve, and appointed to the Rank of Ordinary Seaman, aboard the Merchant Ship Esmerelda, by Authority of the Shipmaster, Captain Riggs.\n\nThe Appointee serves Under the Direction & at the Pleasure of the Captain & Officers. By the Law of the Sea known by all, while Aboard, Violations of Duty will be Judged & Punished by the Captain & Those carrying out his Orders. Thievery, Mutiny, Assault, & Other Gross Breeches of Order will be met by Flogging or Death. At Successful Voyage End, the Appointee on Discharge will get the Agreed Wage recorded in the Ship's Log, less any Deductions.\n"
     "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2__title" "Further Instructions\n"
     "#str_fm_airpocket_xd_sheet_appointment_to_service_pg2_body" "\n\n\n\nAs agreed, the 2 new crewmates will bunk in the galley, and share the sea trunk nearest the door in the mess.\n\nCaptain Riggs\n"

     As stated earlier, this text, stripped of #str_ IDs, tabs, and quotes, was given a long directive context prompt and fed into ChatGPT for forward-translation into all the TDM languages. For verification, individual language results were then back-translated (without additional prompt context) into English using Google Translate. Google Translate was also the preferred choice for spot patch-ups.

     Results were Good Overall

     Across languages, most sentences appeared to round-trip quite well. Some languages (e.g., Turkish) needed more patch-up, perhaps due to less AI training or just a greater sentence-structure differential. I did not stress out about how well the officious over-capitalization of the original was carried over into the translation.

     Forward-Translation Problem #1 – Piecemeal Translation and Naming Consistency

     In order to keep this exploration tractable, at the outset I subdivided the collection of English #str_ into various groups (e.g., inventory items, character names (including shouldered), readable, briefing, etc.), and arranged to translate each group separately into all languages. The downside of this is that the subgroups end up with different names for the same thing. This is not unusual IRL, but not really desirable in a game. So, for instance, a key (inventory item) labeled “Mess” (for the door to the ship’s eating place) often has multiple correct nautical translations in a given language. When the word “mess” in the readable is translated, the AI may choose a different one. So patch-up is needed to make the references match. (Future projects might well benefit from a more holistic use of AI.) This problem also affects ranks and titles of persons, about which more next.

     Forward-Translation Problem #2 – Nautical Ranks and Titles left in English

     ChatGPT, when translating “...appointed to the Rank of Ordinary Seaman”, almost always left “Ordinary Seaman” in English. And it would only sometimes translate “Captain” into a target language. (For a quick look at some suitable nomenclature, type “EU commercial nautical deck department ranks in all EU languages” into Google Search and see what the AI provides.) Furthermore, when “Captain” appeared in the readable as a title, e.g., “Captain Riggs”, ChatGPT liked to leave it unchanged. There are two ways to think about that:
     1) ChatGPT thinks “Captain” is a first name or nickname.
     2) ChatGPT thinks that the rank of Captain (and perhaps the surname) implies this is an Englishman, so the rank should be left as conferred by the governing authority.
     It would be easiest for me to just leave it as “Captain Riggs” throughout. But I’m not really trying to imply Riggs is English, and would think translated immersion would be better if the title was also translated. (I haven’t yet decided which way to go.)

     Forward-Translation Problem #3 – Lost Nuance affecting a Proximity Clue

     The last sentence of the readable refers to “the 2 new crewmates ... share the sea trunk nearest the door in the mess.” The English implies that this trunk is *within* the mess. In retrospect, it would have been better to phrase it that way (perhaps I will in the end product’s [English]). Translations into some languages kept this implication; others just said something like the trunk was “near the mess door”, so it could be outside the mess. This is a bigger problem than you might think, since the areas involved are underwater and chaotic, so search is difficult. I will do patch-up retranslation where needed.
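The "stripped of #str_ IDs, tabs, and quotes" step and the naming-consistency patch-up described above can be sketched in Python. This is only an illustration, not the script actually used: the function names `parse_lang_entries`, `plain_text_for_translation`, and `missing_terms` are made up here, and the glossary terms in the usage note are placeholders, not vetted nautical translations.

```python
import re

# Matches one all.lang entry of the form: "#str_id" <whitespace> "text"
ENTRY_RE = re.compile(r'"(#str_[^"]+)"\s+"((?:[^"\\]|\\.)*)"')

def parse_lang_entries(lang_text):
    """Return {str_id: text} for each "#str_..." "text" pair found."""
    return {m.group(1): m.group(2) for m in ENTRY_RE.finditer(lang_text)}

def plain_text_for_translation(entries):
    """Drop the IDs and quotes; turn the literal backslash-n escapes into newlines."""
    return "\n".join(
        text.replace("\\n", "\n").strip() for text in entries.values()
    )

def missing_terms(translated_text, required_terms):
    """List glossary terms that do not appear in a translated text."""
    lowered = translated_text.casefold()
    return [term for term in required_terms if term.casefold() not in lowered]

# Two of the entries quoted above, tab-separated as in all.lang.
sample = (
    '"#str_fm_airpocket_xd_sheet_appointment_to_service_pg1__title"'
    '\t"Appointment to Service\\n"\n'
    '"#str_fm_airpocket_xd_sheet_appointment_to_service_pg2__title"'
    '\t"Further Instructions\\n"\n'
)
entries = parse_lang_entries(sample)
plain = plain_text_for_translation(entries)
print(plain)
```

A check such as `missing_terms(translated_readable, ["<term chosen for the Mess key>"])` could then flag a readable whose wording for "mess" differs from the inventory item, which is the Problem #1 situation.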
  2. By themselves, yes. Not with a caption, e.g., a bale of hay with the caption "hay". Within an overall area captioned "Smells" I agree feelings, touch, probably tastes are better off text-only.
  3. Engaging concept. For smells, which in your demo tend to predominate and persist, it might be less distracting to use icons+captions, a la the style of inventory items. Maybe, say, a column of these at the left edge. Constant smells would just be opaque, but a likely or possible smell more translucent and/or with pulsing, fade in/out.
  4. You would hope that even if an algorithm raises a flag, a person would still review it, knowing that algorithms are imperfect. I agree that the EU provides way more privacy protection than the US. At least from third-party data brokers... don't know about from governments. I recall that researchers are working on "distilled" or other forms of light-weight AI that can run locally on your PC or phone, without needing the cloud and attendant privacy concerns. Translation is an obvious use case for this work, so some reason for optimism there.
  5. By "IP", I'm thinking you mean my machine's address (not intellectual property). Yes, if an AI was not privacy-preserving, and actively monitoring for and reporting IRL criminal activity (likely many are), and misconstrued what I was doing, then I see your point. For my particular FM, the phrases don't really imply any criminal acts, just clues. This is not the case for all FMs. In those cases, when untrustworthy AI systems allow rich prompts, it's probably a good idea to specifically indicate you are writing for a fictional mission within a PC game. Also, the activity itself, translating into a fixed suite of multiple languages, is probably not something an actual criminal enterprise would do. Talking in the forums about your FM under development would also show that your intentions are not real-world crimes, and I'd like to think ward off all but the most wrong-headed prosecutions.
  6. Nice to hear about this. That's an impressive roster of languages... but there's always the question, how good is it for translation to the particular language(s) one needs? Privacy-centered is great too, although perhaps less important for happenings in the fictional TDM world. I was also looking today at Anthropic's Claude Haiku 4.5 announcement, which seemed to suggest that the free tier includes access to it through its API. (In the long run, I can see driving forward/backward translations through different model APIs.) Haiku is claimed to be a small but powerful AI model. A general concern about making models small is that they are less knowledgeable, so maybe less good translations. So many AI systems, under such swift change. In the meantime, I'm taking a slow, cautious, and somewhat piecemeal approach, seeing where the translation problems are, and how to fix them.
  7. I see a problem that's starting to bite me. I probably should have included some prompting to steer towards European dialects of languages, rather than relying on defaults. The defaults for ChatGPT are said to be Brazilian Portuguese and "Latin American" Spanish (neutral, but leaning towards a mix of Mexican, Cuban, Venezuelan). The default dialects had more training data and so will be more accurate. But European dialects (e.g., Castilian Spanish and Portugal's Portuguese) would better reflect TDM's predominant audience and the game's historic time period. Don't know that I really want to redo for this, tho. And TDM's language choice doesn't mention dialects.
  8. Going to resume the paused Air Pocket back-translation work now.
  9. Cyrillic Mason Released This project is wrapping up here, with hand-off for incorporation into a 2.14 beta release. For TDM’s Mason 48pt font, within the /russian/mason/ and (as a clone) /russian/mason_glow/ scope, it provides full coverage of all the TDM-supported Cyrillic characters. Characters may be seen up-close within this FM: testMasonLora3Way.pk4 of October 1, from which 2 representative overall screenshots are shown here. For further details, besides running the foregoing test FM and viewing what it says in the briefing, see also: 1) The bugtracker post 0006642: Extend Mason Font to All Cyrillic Characters, which will be updated momentarily to indicate that other admins may begin the process of incorporating it into 2.14 beta releases. 2) The doc “Tested Changes to Russian Mason Font, Proposed for TDM 2.14, Oct 14 2025”. This 10-page Word doc first touches on the project background (for those not following previous posts in this forum), then talks in detail about implementation aspects (mainly the successful ones) with their pluses and minuses. 3) The text file “fontimage_48[mason, Sept 30 2025].ref”, used to generate the corresponding final .dat file. It itemizes all 256 ASCII+Cyrillic characters, with annotations for this project. 4) The wiki’s Mason Font article now reflects a summary of these proposed changes.
  10. Lips they flap, but the dubbed voices go their own way. YouTube doesn't have access to the underlying character 3D model, so can't easily adjust lips to match dubs. Tho some artificial intelligence systems for movie dubbing do that now, so I guess it's a matter of time for YouTube too.
  11. I just added a bugtracker ticket about this work: 0006642: Extend Mason Font to All Cyrillic Characters. For the benefit of that post, let me add a summary of the current situation. Current Workplan Originally, TDM's Mason was likely generated from Mason Alternate TTF, which has no Cyrillic coverage. Subsequently, some Cyrillic characters needed for TDM's main menu were presumably hand fabricated. To avoid further tedious hand-work, the strategy is to use the just-developed program ExportUnicodeFontToDoom3 (see wiki) to harvest Cyrillic characters into supplemental DDS files and a corresponding DAT. Source TTFs that have the required Cyrillic glyphs were reviewed. A distinctive aspect of Mason Alternate's style is that angular lines (like the sides of "A") are terminated by serifs that are stroke-perpendicular instead of horizontal. Other reviewed TTF fonts, including Regular Mason (and clone MasonChronicles), do not feature that. Thus, Mason TTF is an okay but not precise match to the Mason Alternate style. It is a good match for some characters, but less so for others, where it substitutes curved lines for angular ones. It also has unpromising licensing. Of TTF fonts with Cyrillic coverage and open-font licensing, Lora has an adequate (though not precise) stylistic match, with the desired angular lines. Lora coverage and character style were shown in a screenshot earlier in this thread. The current workplan is to have a 3-way merge of existing TDM /english/ Mason (which has crisper ASCII characters), existing /russian/ Mason, and newly generated Lora characters. The existing TDM Mason font has some hand-drawn Cyrillic characters, of variable quality, that may be stylistically preferred over Lora. Lora characters will be scaled and custom-aligned as appropriate. Kalinovka and Geep have a private DM going, including a spreadsheet to manage decisions about individual characters.
  12. I did originally consider using either a key binding (hard to come by) or a button. The button would seem to require a lot of iteration and per-readable customization to find a location where it's not blocking the text. Plus i18n for its label and complex treatment of when to hide and show it (or in some cases toggle the text). I got tired just thinking about it. So while I don't deny that player control is good, I feel my more-limited version is more practical to implement, with a reasonable amount of core-coding and GUI-hacking work, for 2.14 or 2.15.
  13. Links for Release 1 of ExportUnicodeFontToDoom3 are finally posted on the wiki.
  14. testLora Released (Aug 30 version - run4) testLora.pk4 This shows an experimental FM, run 4 of testLora. It's certainly not the final product. The idea is to generate the Cyrillic characters for /russian/mason/ from a TTF font that has them (unlike MasonAlternate, the traditional TDM "mason" font), but is not too unlike MasonAlternate in style. And has a license we can live with. During generation, scaling was done (by using ExportUnicodeFontToDoom3 with 45pt as a stand-in for "48pt"). As the screenshot shows, this makes the Lora lower-case characters the same height as MasonAlternate lower case. However, the Lora upper-case characters, unlike MasonAlternate, are not yet scaled up by 120% nor top-aligned. (Nor will they be at this point in the experiments.) The final product will likely have complex 3-way character sourcing, from MasonAlternate ASCII, some MasonAlternate Cyrillic, some Lora Cyrillic. Restating the briefing: Here's a mockup TDM character set, within-world, for a mixture of Mason and Lora fonts for Russian characters. This version of the FM has: - at codepoints 0x00-0x7f, the TDM 2.13 Mason ENGLISH characters. Most of these are crisp due to enlarged bitmaps. - at codepoints 0x80-0xff, freshly generated Lora 45pt (passing as Mason 48pt) for the Russian character set. Make sure TDM's language is set to Russian. On a room's floor, all cp1251-defined (and thus DAT-defined) printable glyphs are laid out, to quickly reveal missing & bad characters. They are shown twice, to evaluate 'stray marks' (due to bounding box overlap of neighbor glyphs) & vertical/horizontal 'spacing' [shown in screenshot]. View it all from the ledge (or noclip), or walk the floor for close inspection. No special top-alignment or per-character scaling yet. If used in an FM (as here), this font will not include 'glow' enhancements.
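As a rough companion to the floor layout described above, the 0x80-0xff half of the character set can be listed with Python's built-in cp1251 codec. This is only a sketch: it uses the standard cp1251 table, whereas TDM's Russian mapping differs from cp1251 in a row or two, and the 16-per-row grid and the helper names are assumptions made for illustration.

```python
def cp1251_upper_half():
    """Return [(codepoint, character or None)] for 0x80-0xFF under standard cp1251."""
    cells = []
    for cp in range(0x80, 0x100):
        try:
            cells.append((cp, bytes([cp]).decode("cp1251")))
        except UnicodeDecodeError:
            cells.append((cp, None))  # this byte is undefined in cp1251
    return cells

def grid_rows(cells, per_row=16):
    """Format cells as text rows with a hex row header; '.' marks non-printable slots."""
    rows = []
    for start in range(0, len(cells), per_row):
        chunk = cells[start:start + per_row]
        header = f"0x{chunk[0][0]:02X}"
        glyphs = " ".join(
            ch if ch is not None and ch.isprintable() else "." for _, ch in chunk
        )
        rows.append(f"{header}: {glyphs}")
    return rows

cells = cp1251_upper_half()
undefined = [cp for cp, ch in cells if ch is None]
for row in grid_rows(cells):
    print(row)
```

The `undefined` list gives the slots that standard cp1251 leaves without a character, which are the ones a generated DAT cannot fill from the map alone.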
  15. Maybe. I try to ideally only rotate once from a grid-aligned object, to minimize problems. So in your case, have a grid-aligned fence segment from which you copy and rotate each segment separately. Or you could temporarily go to a rather fine grid if you have to do cumulative rotations. But in my experience that sometimes can cause other problems... objects in question disappearing, mysterious lighting artifacts.
  16. I'd vote it NO. I'm thinking that a "save game" just copies the process's memory and registers into one big mostly unstructured blob file, and "load" just reverses that process. I don't see any way for DR to work with that.
  17. testMason FM Released (Aug 24 Version) testMason.pk4 This FM shows the complete character set of the Mason font on a floor in-game. It is mainly to support our on-going /russian/ work (but can be used with /english/ as well). The screenshots show the Cyrillic font, when the user selects the Russian font setting. One shot shows the TDM 2.13 situation. Note in particular the clobbered ASCII characters around "D". The other shot shows some 2 dozen "phase1b" DAT corrections I've done, to improve existing characters. There's plenty more work to be done by me & kalinovka from 0x80 on, to replace missing or non-compliant glyphs with new stuff. The FM displays the character set twice, as a "Stray Marks" test seen fully in the screenshots, and an adjacent "Spacing" test. The two tests show the same character subsets in each aligned row. The row headers are in hex for "Stray Marks", in decimal for "Spacing". See the FM briefing for more details.
  18. No. Some lousy alternatives if you really, really need this: maybe fake something by using the overlap of title and body text. Or make a readable with a custom background with strikeout (or text plus strikeout) burned in.
  19. I made some initial progress on the back-translation AI verification work, but I'm pausing that for a little while to help with Mason font work. I see that ChatGPT just released version 5. Claimed to be more reliable. Time will tell.
  20. All - I've started a wiki article about the new-variant command-line program ExportUnicodeFontToDoom3. This will likely be completed and the program download link posted around the weekend.
  21. Probably we'll have to do this mostly asynchronously, due to time zone differences. I'm thinking the spreadsheet would begin with these columns, with values imported from rows 0x80-0xff of the cp1251.txt file: TDM CodePoint (8-bit hex); Unicode Value (16-bit hex); Unicode Name (string). There's a row or two where Russian diverges from cp1251, and will need hand-editing for that. Then we have the decision column: New glyph? (initially blank, then with values N, Y, and ?). And why: Comments. We could go crazy with detail columns, but maybe just putting stuff into Comments will be enough. I'll DM you a link to a first draft if I'm successful in creating it and populating the first 3 columns. EDIT: Link sent
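Populating the first 3 columns from cp1251.txt could look something like the following Python sketch. It assumes the usual Unicode.org "Format A" line shape (8-bit code, Unicode value, then "#" and the character name); the sample lines, the function name `spreadsheet_rows`, and the CSV output are illustrative assumptions, not the actual spreadsheet setup.

```python
import csv
import io

COLUMNS = ["TDM CodePoint", "Unicode Value", "Unicode Name", "New glyph?", "Comments"]

# A few lines in the assumed Format A shape.
SAMPLE_FORMAT_A = (
    "0x7F\t0x007F\t#DELETE\n"
    "0x80\t0x0402\t#CYRILLIC CAPITAL LETTER DJE\n"
    "0x98\t      \t#UNDEFINED\n"
    "0xC0\t0x0410\t#CYRILLIC CAPITAL LETTER A\n"
)

def spreadsheet_rows(format_a_text):
    """Build one row per 0x80-0xFF line; decision and comment columns start blank."""
    rows = []
    for line in format_a_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and whole-line comments
        data, _, name = line.partition("#")
        fields = data.split()
        codepoint = int(fields[0], 16)
        if codepoint < 0x80:
            continue  # only the upper half is tracked
        rows.append({
            "TDM CodePoint": f"0x{codepoint:02X}",
            "Unicode Value": fields[1] if len(fields) > 1 else "",
            "Unicode Name": name.strip(),
            "New glyph?": "",
            "Comments": "",
        })
    return rows

rows = spreadsheet_rows(SAMPLE_FORMAT_A)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The resulting CSV text can be imported into a shared spreadsheet; the rows where Russian diverges from cp1251 would still need hand-editing afterwards.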
  22. Probably we need to set up a shared spreadsheet to track status and decisions on each letter. I haven't used Google spreadsheets before (only Google docs & Excel), but could probably set one up, initialized with cp1251.txt data. If that's of interest to you.
  23. I might not have been clear that the link in the purple text above goes to the standard cp1251.txt file that you need to edit.
  24. Instead of creating the list with the format I mentioned earlier, it would now be more helpful to use a slightly different format, based on your custom edit of Unicode.org's "Format A" for cp1251.txt. The draft upcoming wiki page for ExportUnicodeFontToDoom3, that explains this, begins... Introduction This 2025 offshoot of ExportFontToDoom3All256 is in response to a [request to help – link] extend TDM’s Mason 48pt font to include missing Cyrillic glyphs. The main idea is that the input TTF font would go beyond those that support just ASCII or ANSI, to include those that offer additional Unicode characters. It still limits output to a maximum of 256 characters, consistent with the DAT format. To achieve this, it reads an external “unicodeMap” file, in [Unicode.org’s “Format A” – link]. This provides the mapping from the traditional 8-bit ISO or Windows encoding that TDM uses (e.g., cp1251 for Cyrillic) to the corresponding UCS (Unicode 16-bit) value. As discussed below, you can edit this file in advance, if you want to generate just a subset of glyphs. New Command Line Arguments (in Addition to those of ExportFontToDoom3All256) unicodeMap Example: -unicodeMap "./Test/cp1251.txt" You can edit the unicodeMap file in advance, to specify which particular glyphs you want to generate by suppressing unwanted glyphs, either by: Deleting an unwanted line (or block of lines) entirely, or Prepending a “#” to comment-out the line. Also, this file format allows the Unicode value (e.g., “0x1234”) to be replaced by 6 space characters, when the ISO or Windows standard leaves that character undefined. In all 3 of those cases, every suppressed character in the output DAT file will be represented by the glyph that the font uses for U+0000. A hollow box is common. [additional arguments planned] So in my testing so far, I created a "cp1251upperhalf.txt" file, with all the lower-half ASCII lines deleted. Seems to work with some random Unicode font (Lucida Sans Unicode). 
Beyond that, for your work, probably commenting out individual lines with a starting "#" would allow more flexibility.
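To make the commenting-out approach concrete, here is a small Python sketch that prepends "#" to every line of a Format A map except the codepoints to be generated, then reports which mappings remain active. It reflects only a reading of the three suppression cases described above (deleted line, "#"-commented line, blank Unicode value); it is not the parser inside ExportUnicodeFontToDoom3, and the function names and sample lines are hypothetical.

```python
# A few lines in the assumed Format A shape (8-bit code, Unicode value, #name).
SAMPLE_MAP = (
    "0x80\t0x0402\t#CYRILLIC CAPITAL LETTER DJE\n"
    "0x98\t      \t#UNDEFINED\n"
    "0xC0\t0x0410\t#CYRILLIC CAPITAL LETTER A\n"
    "0xC1\t0x0411\t#CYRILLIC CAPITAL LETTER BE\n"
)

def comment_out_except(map_text, wanted):
    """Prepend '#' to every mapping line whose 8-bit code is not in `wanted`."""
    out = []
    for line in map_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)  # leave blanks and existing comments alone
            continue
        codepoint = int(stripped.split()[0], 16)
        out.append(line if codepoint in wanted else "#" + line)
    return "\n".join(out) + "\n"

def active_mappings(map_text):
    """Return {8-bit code: Unicode value} for lines that are not suppressed."""
    mapping = {}
    for line in map_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # deleted or commented-out glyph
        fields = line.partition("#")[0].split()
        if len(fields) < 2:
            continue  # Unicode value left blank: undefined in the code page
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping

edited = comment_out_except(SAMPLE_MAP, wanted={0xC0})
print(edited)
print(active_mappings(edited))
```

Commenting lines out, rather than deleting them, keeps the full table visible in the file, so a glyph can be re-enabled later by removing a single character.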
  25. That discussion earlier about doubling some Mason dds to 512x512... I got a little confused. That applies to only the 'english' set (that includes Cyrillic glyphs too), not the 'russian' set, which is all 256x256.