Hi,
ScummTR maintainer here. Thanks for this very interesting set of questions, I'll try to help you as much as I can.
(Although posting questions here means that they will be properly indexed by Google, it's probably better to ask your question on
the ScummVM Discord server, which has much more visibility and activity than the forums nowadays. For ScummTR itself, opening
a new discussion on the project is also recommended.)
In general, regarding ScummTR usage:
- Make sure you're using the latest ScummTR release -- some people still use the scummtr.exe program from 2003/2004, but newer releases with some bugfixes were made after 2020
- Have a look at its FAQ and manual pages; some common questions are referenced there (the FAQ can be improved, I just need more feedback or contributions for that)
- Also have a look at NUTCracker, which is a newer, more-maintained alternative to ScummTR, and which lets you change more SCUMM resources than ScummTR/ScummRP/ScummFont. (Currently, it also supports Humongous Entertainment titles, but it doesn't yet support all the pre-Monkey 1 CD LucasArts titles that ScummTR supports.)
So, now, back to your questions…
1. What are the long escape sequences before most of the strings? Do I need to bother with them?
I imagine some kind of ID and / or screen position data?
As LogicDeLuxe said, in the cases you've shown, they're special options to trigger the audio lines spoken when the text is displayed. You don't need to bother with them, and more importantly you shouldn't change these escape sequences at all, otherwise you risk "breaking" the audio lines.
The basic idea is: "if a line starts with a long series of escape sequences in a talkie title, leave these escape sequences the way they are".
(It's -- a bit -- covered in this part of the FAQ, but I guess I could improve it in some way.)
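If you want to double-check that you didn't touch these prefixes by accident, something like the following could help. It's only a rough sketch, not a ScummTR feature: it assumes escape codes are written as a backslash followed by three decimal digits (as in the examples from this thread), that lines are compared without the '-h' header, and the function names are made up for the illustration.

```python
import re

# Assumption: escape codes look like \255 or \016 (backslash + 3 decimal digits),
# as in the examples quoted in this thread.
ESCAPE_RUN = re.compile(r'^(?:\\\d{3})+')


def leading_escapes(line: str) -> str:
    """Return the run of escape codes at the very start of a line ('' if none)."""
    match = ESCAPE_RUN.match(line)
    return match.group(0) if match else ''


def prefix_preserved(original: str, translated: str) -> bool:
    """True when both lines start with exactly the same run of escape codes."""
    return leading_escapes(original) == leading_escapes(translated)


if __name__ == '__main__':
    # Hypothetical lines, only to show the idea.
    before = r'\255\010\001\002Hello there.'
    after = r'\255\010\001\002Hallo daar.'
    print(prefix_preserved(before, after))
```

Running this over the original export and your translated file, line by line, would flag any line whose leading escape run changed.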
2. What is the \016 escape sequence often instead of the last space?
Perhaps a line break indicator, although some testing didn't seem to confirm that.
It's a special typography feature which, as far as I know, only exists in the official releases of Indy4. It's a way of having a non-breaking space, basically. It's very useful in some languages (e.g. French), and it can also help you make sure that a newline will never be inserted between two words.
It looks like whoever dealt with the typography in Indy4 disliked "runts".
If you don't care about this, you can just replace the \016 with a plain space character. Or do some tests in-game to see what it does (i.e. write a long sentence, and have Indy read it while being close to the edge of the screen).
On the other hand, if you do like non-breaking spaces, it's possible to import this special character into most of the other SCUMM titles. So far, only French people have asked me for this.
Speaking of "line break indicator", have a look at the FAQ explanations for the \255\001 (and so on) sequences.
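If you go the "plain space" route for \016, a small script can do the replacement on the exported text. This is a cautious sketch under my own assumptions: escape codes are written as a backslash plus three decimal digits, and a \016 that directly follows another escape code is left alone, since it might be a parameter of that code rather than a non-breaking space. The function name is hypothetical.

```python
import re

# Match \016 only when it is NOT directly preceded by another \ddd escape code.
# Assumption: a \016 right after another escape code may be a parameter byte,
# so it is safer not to touch it.
NBSP = re.compile(r'(?<!\\\d{3})\\016')


def nbsp_to_space(line: str) -> str:
    """Replace the \\016 non-breaking space with a plain space."""
    return NBSP.sub(' ', line)


if __name__ == '__main__':
    print(nbsp_to_space(r"These books don't look\016familiar."))
```

Check the result in-game on a few lines before applying it to the whole file.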
3. Do I need to update the string lengths somewhere?
No, that's the first purpose of ScummTR. Before this tool existed, people would decipher the .LFL files by hand (it's trivial) and edit the strings with a hex editor (here's a very old example of such an older translation; it's full of typos, hacks and deep bugs -- ScummTR was created by Hibernatus as a response to this, for the ATP team that was formed for this translation, AFAIK).
You don't have to care about the string lengths (except for the '@' symbols used for padding objects/actors/verbs when they get renamed; see below). You can't use ScummTR to add or remove a full line, though. (Some translations sometimes require adding new strings, and this becomes more tedious, because it's not implemented in ScummTR itself. Technically, this feature could probably be added, but I'm "only" maintaining it, not really developing it (its original author is not really interested in it anymore either, 20 years later), and I don't have the interest/skills to make any big change to it.)
4. When using the translations, using the in-game save crashes the game. ScummVM state save and load work fine, but they don't reload the translations, (I think?)
Yes, that's expected. That's because ScummTR has to change the resource sizes for the newer string lengths and newer object/actor/verb names, and the saves do not expect this to happen.
This is covered in this part of the FAQ.
While you're still working on your translation, I'd recommend playing with Boot Params instead. It's what the original developers of the games used to play the game at various places, while working on it. Boot Params are not affected by the problems that you see with saves.
Once your translation is mostly done, it's probably a better moment to start doing a full play with saves (although they'll break again if you make new changes). And once your translation is completely frozen, of course the saves are not going to cause this problem anymore.
5. Can/should I also translate things like
which look like headers the game might look for?
No, leave them alone. I don't think there's an easy way for ScummTR to recognize them (otherwise we could just hide them). You need to know the context of the script containing these strings, just to be sure.
When in doubt, if you can't even get the string to be displayed in the game, just leave it as-is.
If you want to read the scripts themselves (it's kinda useful when you do a translation, in general), you can try using something like ScummEX, for example. Give it the ATLANTIS.001 file, and then explore the "rooms" contained inside each LFLF you see. Some resources can be decompiled (behind the scenes, it calls descumm.exe), e.g. 'LSCR', 'SCRP', 'EXCD', 'ENCD', 'VERB'. Then, hit the "Decompile Script" button to see what the script looks like.
If you do your ScummTR import/export calls with the '-h' option, you will see this kind of header, at the start of each line:
[006:LSCR#0200]These books don't look\016familiar.
The first number is the LFL/room number. Then comes the resource type, and then the resource number.
This way, when you encounter an unknown string, you know which script needs to be decompiled in order to know a bit more about its context.
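If you'd rather look these up with a script than by eye, the header can be split into its three parts. The following is just a sketch based on the single example line above: it assumes the header is always `[room:TYPE#number]` with a four-letter uppercase resource type, and the names `ScriptLine` and `parse_line` are made up for the example.

```python
import re
from typing import NamedTuple, Optional

# Assumption: headers always look like [006:LSCR#0200], i.e. decimal room number,
# four uppercase letters for the resource type, decimal resource number.
HEADER = re.compile(r'^\[(\d+):([A-Z]{4})#(\d+)\]')


class ScriptLine(NamedTuple):
    room: int
    resource_type: str
    resource_number: int
    text: str


def parse_line(line: str) -> Optional[ScriptLine]:
    """Split a '-h' style line into its header fields and text; None if no header."""
    match = HEADER.match(line)
    if match is None:
        return None
    return ScriptLine(int(match.group(1)), match.group(2),
                      int(match.group(3)), line[match.end():])


if __name__ == '__main__':
    print(parse_line(r"[006:LSCR#0200]These books don't look\016familiar."))
```

With the room number, type and resource number in hand, you know which resource to open in ScummEX (or to feed to descumm).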
(I see you mention GDB, so you're maybe quite familiar with the command-line interface. In that case, you can install scummvm-tools, run scummrp to extract all the resources, and manually run 'descumm -5' on the resource types given above. Or use NUTCracker for this.)
6. When dealing with @-padded strings, how do I know that
and
are of the same 'group' of fixed-length strings?
I've tried to keep strings that are similar and close together of the same length using @-padding, but not rigidly. So far, no apparent issues with memory corruption.
One way is to look at the context of the script, as described in the previous answer.
Another way is to not bother padding strings yourself, and just use the recommended `-A ao` option in your export/import. It will just pad the strings for you. It does so by padding a lot more than required, but unless you want to play your translation on an original DOS machine from 1990 with its very limited memory, it shouldn't matter.
Note that ScummVM doesn't care about strings not being properly padded, anyway. The original interpreters do care about this, though. You'll hit runtime script errors if some strings don't have enough padding (actually, the official French release of Indy4 had such a fatal error…). If you care about being compatible with the original interpreters, I'd suggest doing a full playthrough of your translation with an original interpreter running under DREAMM or DOSBox.
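If you still prefer to pad names by hand, the idea boils down to filling each name with '@' up to the length shared by its group. Here's a minimal sketch of that, assuming plain names without escape codes and one byte per character; `pad_name` is a hypothetical helper, not something ScummTR provides, and the Dutch names are only examples.

```python
def pad_name(translated: str, width: int) -> str:
    """Right-pad a name with '@' up to `width` characters; never truncate it."""
    return translated + '@' * max(0, width - len(translated))


if __name__ == '__main__':
    # Hypothetical names, padded to a group width of 8 characters.
    for name in ('deur', 'boek', 'boekenkast'):
        print(pad_name(name, 8))
```

A name longer than the chosen width is returned unchanged, so in that case the width of the whole group would have to grow to fit it.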
7. Do these translations already exist somewhere? If not, I'd be happy to contribute it once I'm done.
I'm not sure I understand your question, here.
You mean: do other fan translations exist? Yes. For Indy4 in particular, well, it requires quite a lot of work, because it has so much text, so I'm not sure that many of them exist.
I'm not really aware of a catalog of all the available translations. They can be added to the ScummVM detection tables, if they're completely done, and if and only if they're NOT distributed as full copies of the games.
8. How do I find the button caption offsets in the binary? Translated caption strings are of different length, and one would like to center them again.
Ah, yeah, that one. It's a hardcoded image. My notes say 'object #1263 in room 98'. You should be able to export/import it with NUTCracker, and some MS Paint usage. Or grab the resource from the official German/French releases of Indy4, where it has already been redrawn for larger verbs.
9. Is there a way to construct the text graphs from the output of ScummTR? That way, I could just glance over the conversations to gather the tone and exact meaning. Without context, things like "Well, now", "Hold on", "Come on" and many other shorter phrases just become impossible to accurately translate. Also, in my case, Dutch distinguishes formal from informal 'you', and it would make the whole translation so much better if I knew who was talking to whom.
Use the '-h' option and follow the procedure above to read the strings within their script context. It helps a lot.
And have fun with your translation!