Search the Community
Showing results for tags 'ai vocal script'.
-
Hi folks, and thanks so much to the devs & mappers for such a great game. After playing a bunch over Christmas week after a gap of many years, I got curious about how it all went together, and decided to learn by picking a challenge. Specifically, when I looked at scripting, I wondered how hard it would be to add library calls, for functionality that would never be in core, in a not-completely-hacky way.

Attached is an example of a few rough scripts: one which runs a pluggable webserver, one which logs anything you pick up to a webpage, and one which does text-to-speech and has a Phi2 LLM chatbot ("Borland, the angry archery instructor"). The last is gimmicky, and takes 20-90s to generate responses on my i7 CPU while TDM runs, but if you really wanted something like this, you could host it and just do API calls from the process. The Piper text-to-speech is potentially much more useful IMO. Thanks to snatcher, whose Forward Lantern and Smart Objects mods helped me pull example scripts together.

I had a few other ideas in mind, like custom AI path-finding algorithms that could not be fitted into scripts, math/data algorithms, statistical models, or video generation/processing, etc., but I'm really interested if anyone has ideas for use-cases.

TL;DR: the upshot was a proof-of-concept, where PK4s can load new DLLs at runtime, scripts can call them within and across PK4s using "header files", and TDM scripting was patched with some syntax to support discovery and making matching calls, with proper script-compile-time checking.

Why?

Mostly curiosity, but also because I wanted to see what would happen if scripts could use text-to-speech and dynamically-defined sound shaders. I could also see that simply hard-coding it into a fork would not be very constructive or enlightening, so I tried to pick a paradigm that fits (mostly) with what is there.
In short, I added a Library idClass (that definitely needs work) that will instantiate a child Library for each PK4-defined external lib, each holding an eventCallbacks function table of callbacks defined in the .so file. This almost follows the normal idClass::ProcessEventArgsPtr flow. As such, the .so/DLL extensions mostly behave as sys event calls in scripting. Critically, while I have tried to limit function reference jumps and var copies to almost the same count as the comparable sys event calls, this is not intended for performance-critical code. It is more for things like text-to-speech that use third-party libraries and are slow enough to need their own (OS) thread.

Why Rust?

While I have coded for many years, I am not a gamedev or modder, so I am learning as I go on the subject in general. My assumption was that this is not already a supported approach because of stability and security concerns. It seems clear that you could mod TDM in C++ by loading a DLL alongside and reaching into the vtable, and pulling strings, or do something like https://github.com/dhewm/dhewm3-sdk/ . However, while you can certainly kill a game with a script, it seems harder to compile a script that will do bad things with pointers or accidentally shove a gigabyte of data into a string, corrupt disks, run bitcoin miners, etc. If you want to load a bunch of such mods in a modular way, then unrestricted C++ DLLs don't seem so great.
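To make the per-library callback table described above a little more concrete, here is a minimal sketch in plain Rust. It is not the code from the TDM fork: the `Value` enum, the `Library` struct and its methods, and the `add_floats` callback are all hypothetical names chosen for illustration, and the real engine passes raw event arguments rather than an enum.

```rust
use std::collections::HashMap;

/// Hypothetical argument/return type; a stand-in for raw event arguments.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Float(f32),
    Text(String),
}

/// A callback as it might be stored in a per-library table.
type Callback = fn(&[Value]) -> Result<Value, String>;

/// One instance per external library, loosely mirroring the child Library idea.
struct Library {
    name: String,
    callbacks: HashMap<String, Callback>,
}

impl Library {
    fn new(name: &str) -> Self {
        Library { name: name.to_string(), callbacks: HashMap::new() }
    }

    /// Add a named callback to this library's table.
    fn register(&mut self, event: &str, cb: Callback) {
        self.callbacks.insert(event.to_string(), cb);
    }

    /// Look the event up by name and dispatch, failing loudly if it is missing.
    fn call(&self, event: &str, args: &[Value]) -> Result<Value, String> {
        match self.callbacks.get(event) {
            Some(cb) => cb(args),
            None => Err(format!("{}::{} is not registered", self.name, event)),
        }
    }
}

/// Example callback: checks its argument shape before doing any work.
fn add_floats(args: &[Value]) -> Result<Value, String> {
    match args {
        [Value::Float(a), Value::Float(b)] => Ok(Value::Float(a + b)),
        _ => Err("expected two floats".to_string()),
    }
}
```

A caller would create `Library::new("mod_math")`, call `register("add_floats", add_floats)`, and then dispatch with `call("add_floats", &[...])`. A missing name or a mismatched argument list produces an explicit error instead of undefined behaviour, which is the property the post is after.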
So, I thought "what provides a lot of flexibility, but some protection against subtle memory bugs", and decided that a very basic Rust SDK would make it easy to define a library extension as something like:

```rust
#[therustymod_lib(daemon=true)]
mod mod_web_browser {
    use crate::http::launch;

    // Long-running entry point for the daemon (the web server).
    async fn __run() {
        print!("Launching rocket...\n");
        launch().await
    }

    // Exposed to scripts as an event; reports whether the first log write worked.
    fn init_mod_web_browser() -> bool {
        log::add_to_log("init".to_string(), MODULE_NAME.to_string()).is_ok()
    }

    fn register_module(name: *const c_char, author: *const c_char, tags: *const c_char, link: *const c_char, description: *const c_char) -> c_int {
        ...
```

and then Rust macros can handle mapping return types to ReturnFloat(...) calls, etc. at compile-time rather than having to add layers of function call indirection. Ironically, I did not take it as far as building the unsafe wrapping/unwrapping of C/C++ types into the macro, so the addon writer then has to write unsafe calls to convert *const c_char to a string and vice versa. However, once that's done, the events can then call out to methods on a singleton and do actual work in safe Rust. While these functions correspond to dynamically-generated TDM events, I do not let the idClass get explicitly leaked to Rust, to avoid overexposing the C++ side, so they are class methods in the vtable only to fool the compiler and not break Callback.cpp.

For the examples in Rust, I was moving fast to do a PoC, so they are not idiomatic Rust and there is little error handling, but like a script, when it fails, it fails explicitly, rather than (normally) in subtle user-defined C++ buffer overflow ways. Having an always-running async executor (tokio) lets actual computation get shipped off fast to a real system thread, and the TDM event calls return immediately, with the caller able to poll for results by calling a second Rust TDM event from an idThread.
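As a rough illustration of the two points above (unwrapping `*const c_char` by hand, and returning immediately while a second event polls for the result), here is a self-contained sketch. It is an assumption-laden stand-in rather than the SDK's code: it uses `std::thread` instead of tokio to stay dependency-free, and `submit_job`, `poll_job`, `free_result`, and `slow_work` are hypothetical names. A real exported symbol would also need a `no_mangle` attribute, omitted here.

```rust
use std::collections::BTreeMap;
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};
use std::sync::Mutex;
use std::thread;

/// Finished results keyed by job id (hypothetical singleton state).
static RESULTS: Mutex<BTreeMap<c_int, String>> = Mutex::new(BTreeMap::new());
/// Last job id handed out.
static NEXT_ID: Mutex<c_int> = Mutex::new(0);

/// Safe core: a stand-in for slow work such as speech synthesis.
fn slow_work(input: String) -> String {
    input.to_uppercase()
}

/// Event 1: copy the C string, hand the work to a thread, return a job id at once.
/// Returns -1 for a null or non-UTF-8 input.
pub extern "C" fn submit_job(text: *const c_char) -> c_int {
    if text.is_null() {
        return -1;
    }
    // SAFETY: the caller must pass a valid NUL-terminated string.
    let owned = match unsafe { CStr::from_ptr(text) }.to_str() {
        Ok(s) => s.to_owned(),
        Err(_) => return -1,
    };
    let id = {
        let mut next = NEXT_ID.lock().unwrap();
        *next += 1;
        *next
    };
    // The work runs on its own OS thread; the event itself does not block.
    thread::spawn(move || {
        let out = slow_work(owned);
        RESULTS.lock().unwrap().insert(id, out);
    });
    id
}

/// Event 2: poll for a result. Returns null while the job is still running.
/// A non-null pointer must be released with `free_result`.
pub extern "C" fn poll_job(id: c_int) -> *mut c_char {
    match RESULTS.lock().unwrap().remove(&id) {
        Some(s) => CString::new(s)
            .map(CString::into_raw)
            .unwrap_or(std::ptr::null_mut()),
        None => std::ptr::null_mut(),
    }
}

/// Give a string returned by `poll_job` back to Rust so it can be freed.
pub extern "C" fn free_result(ptr: *mut c_char) {
    if !ptr.is_null() {
        // SAFETY: ptr came from CString::into_raw in poll_job.
        unsafe { drop(CString::from_raw(ptr)) };
    }
}
```

The only unsafe code sits at the boundary; `slow_work` and the result table are ordinary safe Rust. The input is copied before the thread starts, so the caller's buffer does not need to outlive the call.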
As an example of a (synchronous) Rust call in a script:

```
extern mod_web_browser {
    void init_mod_web_browser();
    boolean do_log_to_web_browser(int module_num, string log_line);
    int register_module(string name, string author, string tags, string link, string description);
    void register_page(int module_num, bytes page);
    void update_status(int module_num, string status_data);
}

void mod_grab_log_init() {
    boolean grabbed_check = false;
    entity grabbed_entity = $null_entity;
    float web_module_id = mod_web_browser::register_module(
        "mod_grab_log",
        "philtweir based on snatcher's work",
        "Event,Grab",
        "https://github.com/philtweir/therustymod/",
        "Logs to web every time the player grabs something."
    );
```

On the verifiability point, both because there are transpiled TDM headers and to mandate source code checkability, the SDK is AGPL.

What state is it in?

The code goes from early-stage but kinda (hopefully) logical (what's in my TDM fork), through basic (what's in the SDK) and rough (what's in the first couple of examples), to hacky (what's in the fun stretch-goal example, with an AI chatbot talking on a dynamically-loaded sound shader; see below). The important bit is the first, the TDM approach, but I did not see much point in refining it too far without feedback or a proper demonstration of what this could enable. Note that the TDM approach does not assume Rust; I wanted that as a neutral baseline. It passes out a short set of allowed callbacks according to a .h file, so any language that can produce dynamically-linkable objects should be able to hook in.

What functionality would be essential but is missing?

- support for anything other than Linux x86 (but I use TDM's dlsym wrappers, so this should not be a huge issue if the type sizes, etc. match up)
- the ability to conditionally call an external library function (the dependencies can be loaded out of order and used from any script, but right now every referenced callback needs to be in place with matching signatures by the time the main load sequence finishes, or it will complain)
- packaging a .so+DLL into the PK4, with verification of source and checksum
- tidying up the Rust SDK to be less brittle and (optionally) transparently manage pre-Rustified input/output types
- some way of semantic-versioning the headers and (easily) maintaining backwards compatibility in the external libraries
- right now, a dedicated .script file has to be written to define the interface for each .so/DLL - this could be dynamic via an autogenerated SDK callback to avoid mistakes
- maintaining any non-disposable state in the library seems like an inherently bad idea, but perhaps Rust-side Save/Restore hooks would help
- any way to pass entities from a script, although I'm skeptical that this is desirable at all

One of the most obvious architectural issues is that I added a bytes type (for uncopied char* pointers) to the scripting. It is not meant for the script to interact with directly, but so that, for instance, a lib can pass back a Decl definition that can be held in a variable until the script calls a subsequent (sys) event call to parse it straight from memory. That breaks a bunch of assumptions about event arguments, I think, and likely save/restore. Keen for suggestions - making indexed entries in a global event arg pointer lookup table, say, that the script can safely pass about? Adding CreateNewDeclFromMemory to the exposed ABI instead?

While I know that there is no network play at the moment, I also saw that somebody had experimented with it and I did not want to make that harder, so I am conscious that this would also need some thought.
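The indexed lookup table floated above as an alternative to raw bytes pointers could look roughly like the sketch below. This only illustrates the general handle-table technique, not anything in the fork: `Handle`, `HandleTable`, and the method names are hypothetical, and how such a table would interact with save/restore is left open.

```rust
use std::collections::HashMap;

/// Opaque handle a script could hold instead of a raw pointer (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Handle(u32);

/// Owns byte buffers and hands out integer handles for them.
struct HandleTable {
    next: u32,
    entries: HashMap<Handle, Vec<u8>>,
}

impl HandleTable {
    fn new() -> Self {
        HandleTable { next: 1, entries: HashMap::new() }
    }

    /// Store a buffer (for example a generated decl) and return its handle.
    fn insert(&mut self, data: Vec<u8>) -> Handle {
        let handle = Handle(self.next);
        self.next += 1;
        self.entries.insert(handle, data);
        handle
    }

    /// Borrow the buffer; an unknown or stale handle yields None, not a crash.
    fn get(&self, handle: Handle) -> Option<&[u8]> {
        self.entries.get(&handle).map(|v| v.as_slice())
    }

    /// Remove the buffer, for instance once the consumer has parsed it.
    fn take(&mut self, handle: Handle) -> Option<Vec<u8>> {
        self.entries.remove(&handle)
    }
}
```

Because the script only ever sees an integer, a stale or made-up handle fails the lookup instead of dereferencing freed memory, and the table's contents are plain data that could in principle be written out and reloaded.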
One maybe interesting idea for a two-player stealth mode could be a player-capturable companion to take across the map, like a capture-the-AI-flag. Pluggable libs might help with adding statistical models for logic and behaviour more easily than scripts, so I can see ways dynamic libraries and multiplayer would be complementary if the technical friction could be resolved.

Why am I telling anybody?

I know this would not remotely be mergeable, and everyone has bigger priorities, but I did wonder if the general direction was sensible. Then I thought, "hey, maybe I can get feedback from the core team if this concept is even desirable and, if so, see how long that journey would be". And here I am.

[EDITED: for some reason I said "speech-to-text" instead of "text-to-speech" everywhere the first time, although tbh I thought both would be interesting]
-
So, already a few weeks ago, I started writing a Pagan vocal script concept. I took the master template for new VSs, wrote down ideas I had already brainstormed on the side, then read closely through the VS to see where some would fit, and what other lines I could insert and expand upon. I try to keep the lines short and snappy, but interesting enough.

Before I even started, I read all the past discussions on this topic and made notes of other people's existing suggestions, just to have a point of comparison, both for what I might try for my script, and for what I wouldn't (because it would not fit tonally).

Much has been discussed on how to avoid making the Pagan characters sound like caricatures, either in speech style, or in overdone references to nature/deities/etc. They're not meant to be "New Age tree-hugger hippies", they're meant to be realistic-sounding individuals. As The Dark Mod's lore includes nuances such as Pagans being not only some yet-untamed "barbarian" tribesmen outside of the Empire, but also some hidden Pagans among the commoners in cities/towns and villages/rural hamlets, I had to account for that while putting together the script. You'll see more of my rationale once I expand this post in the near future, when I have the concept script fully ready.

For the time being, let's just say I tried to avoid too many overt references to nature and pre-Builder folk religion. Ergo, as it wouldn't make sense for an urban Pagan from the City's narrow alleys and slums to talk about, e.g., mighty stags on a forest meadow (or something like that), I try to make any and all nature references more down-to-earth and subtle. For example, an AI that is alerted and searching for a hiding trespasser: "Where have you scurried to, little mouse? Where, oh where, have you scurried to?" No diminutives, no plant and magic references in every second word, but you still have this vague indication the guy in question might be more of a nature-worshipper in private than a Builder monotheist.
(Not that the Builder-faithful wouldn't have an appreciation for nature, it's just that their views of it would differ somewhat, on a psychological/cultural level.) There are some lines about spirits or natural forces and so on too, but most of the other lines are such that they could work for any commoner in a rural or urban setting.

----

This entire vocal script concept is readable below. For the sake of quicker readability, I have divided the overview with spoilers, based on the sections of the vocal script.

----

BASIC INFORMATION ON THE VOCAL SCRIPT ("Pagan male / Tribesman")

AI STATES: Relaxed

AI STATES: Alert
These barks are meant to tell the player that the AI has seen or heard something.

AI STATES: Searching

COMBAT AND PURSUIT

FINDING EVIDENCE
You have found or observed something that looks out of place. You aren't seeing or hearing the intruder directly, but something that might be a sign that one was here earlier.

Greetings
Since greetings can be made to friends or strangers, delivery should be fairly neutral. Also, there's no way to know whether the AI have seen each other once or twenty times before greeting each other, so typical "hello" greetings should be limited in favour of casual comments or questions that can be answered 'yes'. Greetings are not exchanged between sitting characters, so assume that the greeting is a quick one as AI pass each other. Not every greeting is needed for every vocal character. The thug, for example, has special greetings for female characters because he's a sexist pig, but not every character needs those.

----

Feedback?

I'd like to ask for your constructive criticism now. Feel free to comment on the lines and tell me what could be improved, added, dropped, or changed. I'm all ears. Sink your nitpicky teeth into this vocal script proposal.

Final note

Besides this particular VS, I also have one or two more in development, and I plan to start work on them soon.