In this installment of the LSP Maturity blog, we look at Translation Quality Evaluation and the evaluation methods used in our industry. How do high-performing organizations define quality? What role does quality play in the translation process?
Choosing Quality Evaluation Metrics
Our industry has developed many Quality Evaluation metrics over the years. TAUS, the translation innovation and automation resource center, writes about translation evaluation metrics regularly. Their 2014 Error Typology Benchmarking Report evaluated 18 traditional error typology methods, including the SAE J2450 metric that we currently use for client review. They conclude that these error typology methods fall short in two areas:
- The process and evaluation criteria around the use of these models could be further standardized to meet a limited set of criteria: language, accuracy, terminology and style.
- These methods do not provide a holistic approach to evaluating translation quality, which according to TAUS requires a Dynamic Quality Framework.
This, along with the ongoing conversation recorded in TAUS’ 2015 and 2016 keynotes, motivated us to look at our own translation evaluation methods. When we look at our own client linguistic review process, there are two areas for improvement:
- Our evaluation criteria should be reduced to a few basic criteria that are easily understood by client reviewers.
- Our model could be better integrated into the current review process and allow clients to emphasize what’s important to them.
However, just offering a model doesn’t necessarily get clients to think about quality. Quality happens throughout the process. If clients do not get involved until the review stage, they are missing out.
How High Performing Clients Define Quality
Many clients start doing quality evaluation only when they perform a client review of the deliverable. In an ideal client relationship, you’ll want to define Quality Evaluation criteria ahead of time. You’ll want to manage terminology before you translate and establish a set of standards for evaluation. In panel discussions, high-performing organizations say they do not define all of their content the same way. They make distinctions between different types of content. LinkedIn, for example, classifies content as either conventional or unconventional. This forces them to consider different evaluation criteria for each type of content and what constitutes “correctness.” Different clients also look for different ways to incentivize quality by making goals relevant and actionable.
TAUS recognized the need for organizations to define relevant quality criteria ahead of time. Quality also needs to be measured consistently in order to make improvements. Many of the existing quality frameworks are based on a fixed set of criteria that apply to specific kinds of content. They are not flexible enough to take purpose into account. For instance, a pass/fail on accuracy may be good for technical content, but creative content needs to be engaging and drive results. What TAUS concludes is that quality evaluation needs to be flexible and limited to a small set of meaningful criteria that everyone agrees upon ahead of the project.
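As a rough illustration of what "flexible, but limited to a few agreed criteria" could look like in practice, here is a minimal sketch of per-content-type quality profiles. The four criteria are the ones named earlier (language, accuracy, terminology, style). The content types, weights, and pass marks are made-up assumptions for the example, not values published by TAUS or used in any standard.

```python
# Hypothetical quality profiles agreed with the client before the project.
# Criteria names come from the article; the weights and pass marks below
# are illustrative assumptions only.
PROFILES = {
    "technical": {
        "weights": {"accuracy": 0.5, "terminology": 0.3, "language": 0.15, "style": 0.05},
        "pass_mark": 0.9,
    },
    "marketing": {
        "weights": {"accuracy": 0.25, "terminology": 0.15, "language": 0.2, "style": 0.4},
        "pass_mark": 0.8,
    },
}


def evaluate(content_type, scores):
    """Return (weighted score, passed) for reviewer scores between 0.0 and 1.0."""
    profile = PROFILES[content_type]
    weights = profile["weights"]
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total = sum(weights[name] * scores[name] for name in weights)
    return round(total, 3), total >= profile["pass_mark"]


if __name__ == "__main__":
    # The same review scores judged against two different purposes.
    review = {"accuracy": 1.0, "terminology": 1.0, "language": 1.0, "style": 0.0}
    for content_type in PROFILES:
        print(content_type, evaluate(content_type, review))
```

The point of the sketch is that the same reviewer scores can pass for one type of content and fail for another, because the weighting reflects what the content is for.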
How the Localization Process Drives Quality
Another problem with quality evaluation is the disconnect between outcome and localization effort. High-performing clients look at how the criteria influence the localization process. When an organization sets goals such as participation rates, click-through rates, or engagement, how are these goals affected by the localization process? We can’t do much once the process is complete. One good example is a client reviewer who made very literal edits in Spanish. The assumption might be that this client didn’t understand Spanish well enough and mistook our translation for being inaccurate. However, we found out that this client was reviewing for readability. What the client failed to communicate was that their letter was going to be read by employees who read at a 3rd grade level.
The disconnect here is that localization is not a good place in the process to fix readability. Instead, writing for readability is done in the source language (English, in this case) for a specific grade level, using Health Literacy and Plain Language Writing methodologies. After that, you localize using best practices to keep the readability in other languages at comparable levels.
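To make "writing for a specific grade level" concrete, here is a minimal sketch that estimates the grade level of English source text before it goes to translation. It uses the Flesch-Kincaid grade level formula, which is one common English-only measure and our choice for this example, not necessarily the one a given Plain Language methodology prescribes. The syllable counter is a crude vowel-group heuristic, so treat the result as a rough indicator.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping a trailing silent 'e'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade(text):
    """Flesch-Kincaid grade level estimate for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    plain = "We will pay you. Sign the form. Send it back to us."
    dense = "Reimbursement eligibility determinations necessitate comprehensive documentation submission."
    print(f"plain: {fk_grade(plain):.1f}")
    print(f"dense: {fk_grade(dense):.1f}")
```

A check like this belongs at the source-writing stage. The formula is calibrated for English, so it should not be applied to the translated text.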
Productivity in Quality Evaluation
Productivity is an issue in Quality Evaluation because evaluation adds time to a project. TAUS deals with high-volume organizations. Their issue is that there are simply not enough time and resources to manually evaluate large databases of Machine Translation output. Quality Evaluation of Machine Translation is undergoing an automation revolution. In our world, we deal with lower-volume, high-quality translations. Here we see that the volume typically does not justify a thorough quality review.
How do we know we actually meet client requirements? Without any knowledge of these measurements, we will not find out until after we receive feedback from the client. At that point, we can only work these findings into a future job. When client review is an ad hoc process, all we can do is ensure that any changes made are consistent and accurate. Client reviewers typically don’t see the English versions. This creates a misunderstanding, as a translation can only be evaluated based on how accurately it conveys the English.
Driving Quality Evaluation up the Process
Productivity in Quality Evaluation comes from starting early, before the translation process begins. Clients first identify a need or a problem. That need drives specific goals. The goals drive the messaging, and the design drives the brand and carries the messaging. It’s this information that should drive the translation quality process.
One example is the work we do for a pharmaceutical organization that deals with strict regulations. The pharma industry often asks for back translations as a means of evaluation. However, back translations tell you in English either what is being said literally in the translation or what is intended within the context of the message. What ends up happening is that back translations are taken literally as a way to determine whether the translation is accurate. This creates a disconnect between the client reviewer and the translator.
Understanding the Role of Client Review
Given that back translation is the yardstick, how much value is placed on making sure the messaging is relevant in other languages? Unless the industry can become comfortable with ambiguity, it is best served by translations that are more literal rather than more relevant. In this case, the client’s criteria drive the quality process, which in turn should reduce the need for back translation. If you know your client will evaluate the translation on how literal it is, then that’s how you translate.
Keep in mind that Client Review is the last step in Quality Evaluation. It doesn’t drive any quality process unless it drives continuous improvement. Those findings need to drive the Quality Evaluation process up the chain. With ongoing translation work, client review may be more of an investment early on. But over time, you are best served by setting up Quality Evaluation standards. Only then can you set up an effective quality control process, where client review focuses on the same criteria that were used to produce the translations. You’ll find that client review may go a lot more smoothly.
Do you discuss Quality Evaluation with your language service provider? Do you have any feedback or insight on driving the quality process up the chain? If so, let us know. Not working with a language service provider or searching for a new one? Let us find out what your requirements and objectives are and see if we would be a good fit for you. Give us a shout!