Many Microsoft Word documents are not prepared with translation in mind. The formatting we receive is typically optimized for the source content only. When we prepare a Word document for translation, we try to anticipate where we would otherwise spend a lot of manual time fixing formatting in the translated file. Most of these issues become apparent in our translation memory software, where text is extracted and segmented mainly on the paragraph marks in Word. The text is put in a bilingual format (source > target), and when the translation is complete, the software replaces the source strings with the corresponding target translations. Because we work with the source text and formatting as provided, we need to look ahead and catch any formatting issues that may arise, which adds time and cost to a project. Here are some Word formatting best practices:
Working with Styles and Paragraphs
We recommend working with styles because they let you control 95% of the formatting through common content structures, such as body text, headings, and table text. Every formatting adjustment you can make to text is contained within that one style. So, if you want to condense the font slightly across all body text but not in the headings, a single style adjustment takes care of that. This does require some up-front setup, but once your styles are defined, formatting becomes very consistent and very easy.
Another good practice in preparing Word documents is to be consistent with paragraph formatting. Word typically breaks lines in three different ways:
- It flows automatically to the next line when it hits the (page) margin,
- It is broken by a hard return (Enter), or
- It is broken by a soft return (Shift+Enter).
Soft returns are often used to force a line break without starting a new paragraph. We see them a lot, and our translation memory software handles them well: the translator sees the full sentence in one segment rather than broken into two, as would happen with a hard return. We typically do not carry them over into the translation, though, because it is never possible to know exactly where the line breaks will fall in the target language. Text can easily be wrapped around objects in Word, but this feature is often ignored. Instead, in files we receive for translation, we see sentences broken up manually with hard or soft returns to fit around a specific shape (like a call-out box or an image). In those cases, we remove the manual breaks and let Word wrap the text automatically around the object, because the manual breaks will never fit exactly in another language. Text can also end up hidden behind text boxes or images, and nobody will know for sure unless the linguist reviews the file carefully. Therefore, it’s better to always make sure text wraps around text boxes and other objects.
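The difference between hard and soft returns during segmentation can be illustrated with a minimal Python sketch. It is not how any particular translation memory tool works internally; the `segment` function is hypothetical, and the sketch assumes a plain-text export in which a newline stands for a paragraph mark (hard return) and a vertical tab stands for a manual line break (soft return).

```python
# Assumed stand-ins for this sketch only:
HARD_RETURN = "\n"  # paragraph mark (Enter)
SOFT_RETURN = "\v"  # manual line break (Shift+Enter)


def segment(text):
    """Split text into segments on paragraph marks only.

    Soft returns are treated as ordinary spaces, so a sentence that was
    visually broken with Shift+Enter stays in one segment.
    """
    segments = []
    for paragraph in text.split(HARD_RETURN):
        # Replace soft returns with spaces and collapse repeated whitespace.
        cleaned = " ".join(paragraph.replace(SOFT_RETURN, " ").split())
        if cleaned:
            segments.append(cleaned)
    return segments


soft_wrapped = "Please read the safety\vinstructions before use."
hard_wrapped = "Please read the safety\ninstructions before use."

print(segment(soft_wrapped))  # one segment: the full sentence
print(segment(hard_wrapped))  # two segments: the sentence is split in half
```

With the soft return, the translator receives one complete sentence. With the hard return, the same sentence arrives as two fragments that have to be translated out of context.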
Tables and Tabs
Tables solve a lot of issues in translation. They are a neat way to organize information across the width of a page. Perhaps one reason tables are rarely used for anything other than strictly tabular data is that they need extra formatting to look like regular text. When text does not need to look like it belongs in a table, it seems easier to space it out with tab or space characters. However, this manual spacing falls apart when the translation contains longer or shorter words.
Whenever text is distributed across the page like a table, it’s best to keep it in an actual table. Source copy that fits on a single line will often run to two lines in translation, and a table handles that automatically. Another thing we often see is that a table layout built with spaces or tabs contains broken text (see image). Column 1 starts with text, then comes a tab character, then the column 2 text, and after a line break the text continues under column 1 again. When processed, this all becomes broken text out of context that cannot be translated correctly.
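A small Python sketch can show why a tab-built layout produces broken text. The sample lines and both helper functions are made up for illustration; the first mimics a simplified line-by-line extraction, and the second shows what the same content looks like when each column is treated as one table cell.

```python
# Two visual columns faked with tabs and line breaks (hypothetical sample).
tabbed_lines = [
    "Remove the cover\tCheck that the power",
    "before cleaning.\tcord is unplugged.",
]


def segments_from_lines(lines):
    """Simplified view of a line-based extraction: every line is cut at the
    tab characters, so each sentence ends up in separate fragments."""
    return [
        part.strip()
        for line in lines
        for part in line.split("\t")
        if part.strip()
    ]


def cells_from_lines(lines, columns=2):
    """Rebuild the text per column, as a real table cell would hold it."""
    cells = [[] for _ in range(columns)]
    for line in lines:
        for index, part in enumerate(line.split("\t")[:columns]):
            if part.strip():
                cells[index].append(part.strip())
    return [" ".join(parts) for parts in cells]


print(segments_from_lines(tabbed_lines))
# ['Remove the cover', 'Check that the power', 'before cleaning.', 'cord is unplugged.']
print(cells_from_lines(tabbed_lines))
# ['Remove the cover before cleaning.', 'Check that the power cord is unplugged.']
```

The first result interleaves fragments of two different sentences. In a real table, each cell already holds its complete sentence, so no reconstruction is needed.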
Tabs are sometimes helpful when you need to create space between words (for example, to create fill-out lines). You can use tab stops in Word to lock in the width of a tab. If you have a form with a lot of fill-out lines, consider using tab characters that you control with a tab stop marker on the ruler.
Working with Auto-Generated Data and Field Codes
There are several elements in Word documents that can be automatically generated, such as a Table of Contents. A Table of Contents is generated by picking up Heading styles. We often see a Table of Contents customized manually to fit certain needs, for example by adding or removing entries or by modifying its format, such as indentation or page number formats. However, any of this manual work needs to be replicated in the target language, which adds time spent comparing the Table of Contents output between the source and target language. A better approach is to work with Table of Contents styles and field codes to pull in exactly what you want from the Word document.
There is a tremendous amount you can do with fields in Word. We once worked on a book that was formatted in Word. Each even page carried the title of the book in its header, while each odd page carried the title of the current chapter. It’s easy to adjust this manually, based on how the translation paginates, in a 5-page document, but in a 160-page document it becomes very tedious. And it’s unnecessary, because fields in Word can automatically insert the document title or grab the Heading 1 text of the current section. Plus, you can set it up so that the header alternates per page. It takes a bit of time to figure out, but we would always want to automate this in the English source first before going into translation.
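The logic that such fields automate can be sketched in a few lines of Python. This is a simplified model, not Word's actual field engine: the `running_headers` function and the page data are hypothetical, and the sketch assumes the chapter title is the most recent "Heading 1" paragraph seen up to and including the current page.

```python
def running_headers(pages, book_title):
    """Return the header text for each page.

    pages: list of pages, each a list of (style_name, text) paragraphs.
    Even pages show the book title; odd pages show the current chapter.
    """
    headers = []
    current_chapter = ""
    for number, paragraphs in enumerate(pages, start=1):
        for style_name, text in paragraphs:
            if style_name == "Heading 1":
                current_chapter = text  # a new chapter starts on this page
        headers.append(book_title if number % 2 == 0 else current_chapter)
    return headers


# Hypothetical five-page document.
pages = [
    [("Heading 1", "Chapter 1"), ("Body", "...")],
    [("Body", "...")],
    [("Body", "...")],
    [("Heading 1", "Chapter 2"), ("Body", "...")],
    [("Body", "...")],
]

print(running_headers(pages, "My Book"))
# ['Chapter 1', 'My Book', 'Chapter 1', 'My Book', 'Chapter 2']
```

Because the headers are derived from the content rather than typed in, they stay correct when the translation paginates differently from the source.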
The Word Formatting Mantra
Always keep in mind in translation: Format once, re-use often! More often than not, you can let Microsoft Word handle the flow of text if you set it up correctly. Whether that means using tables, proper paragraph formatting, or fields, consider the potential need for translation. A good translation partner will look out for you and make sure that any potential formatting issues are dealt with ahead of time. However, if your organization is putting out a lot of Word content, consider working on some process improvements and look at what we call “Language & Design Readiness.” Word formatting should never be a standard 10% add-on. Preparing one Word document for translation can take a few hours, while another document of the same length may take 10 minutes. What you should be striving for is systematic improvement to “reduce the time needed to prepare your documents for translation.” This is actually one of the metrics in the SWOT analysis that we offer as one of the tools in our toolkit for our clients.
Want an assessment of your Translation Workflow? Contact us!