Three main reasons why people mess up the glossary - and how to get it right

In my understanding, a glossary is a list of KEY terms that are industry-, client- or context-specific. With a robust glossary, translations are more consistent and more respectful of the client's terminological preferences, and both research and review time go down. So if it is just a list of words, how do people manage to mess it up more often than not? Here is why:

  1. The glossary is typically treated as an up-front activity when it really should span the entire translation life cycle. I usually see people building terminology the very first time they go through one or more documents, or relying on statistical terminology extraction. This results in poor term selection for two reasons: terms that are statistically insignificant can still matter a great deal during translation, and without proper context it is hard to know what is important or even what constitutes a term. Some terms may appear only once in the entire text and seem obvious, but only when I begin researching do I notice that there are different industry or client standards for that same term. Other terms may sound technical or complicated but turn out to have widely adopted and accepted equivalents.
  2. Less is more. Really. When I talk to people about glossaries, there seems to be an underlying assumption that the more terms a glossary has, the more complete it is. In my opinion, more terms increase the probability of irrelevant entries that obscure what should truly be emphasized. A large number of terms also makes it nearly impossible for automated QA to help the translator, because you begin to get far too many false positives and alerts to spot what is truly important.
  3. What constitutes a term? Take for instance "Board of Directors". With proper context, it should be logged as a single term, but without it, or through statistical extraction, "Board" and "Directors" may well be flagged as separate glossary entries. This results in terms that are mapped but do not necessarily help the translator, and that can in fact throw them off across different languages.
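
To make the third point concrete, here is a minimal Python sketch of how a QA-style term check behaves with a fragmented glossary versus a contextual one. The function name find_terms, the sample sentence and both glossaries are my own assumptions for illustration; real CAT and QA tools match terms in their own ways.

```python
import re

def find_terms(text, glossary):
    """Return the glossary terms found in text, preferring the longest match."""
    if not glossary:
        return []
    # Try longer terms first so "Board of Directors" wins over "Board".
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [m.group(0) for m in pattern.finditer(text)]

sentence = "The Board of Directors approved the budget."

# Hypothetical glossaries, for illustration only.
fragmented = ["Board", "Directors"]   # what naive extraction might log
contextual = ["Board of Directors"]   # what a translator would log

print(find_terms(sentence, fragmented))  # ['Board', 'Directors']
print(find_terms(sentence, contextual))  # ['Board of Directors']
```

With the fragmented glossary the translator gets two alerts for one concept, each carrying a translation that may not fit the compound. With the contextual glossary there is a single alert that maps to the right unit.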

So that is what compromises an otherwise excellent glossary. But how do we produce a good one? The first step is to get all parties involved to agree on a few premises:

  • the fewer terms the better (they just need to be superbly chosen)
  • mapping out the terms in the source language is at least as important as the translations assigned to those terms
  • it is a work in progress and will require significant dialogue to achieve expected end results

Once people agree on these premises, the challenge takes on a technical dimension: how to ensure that we exchange knowledge efficiently. Emails and spreadsheets can do the trick, but they are inefficient and do not give stakeholders real-time control over terminology. Establishing a shared term base with clear rules for new entries and for changes to existing entries goes a long way toward ensuring that communication flows smoothly. Every piece of feedback also needs to be mapped onto the glossary so that it matures over time and actually reflects stakeholder preferences.
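
As a rough illustration of what "clear rules" could look like, here is a small Python sketch of a term base that refuses new entries without context and records every change together with its reason. The class names, the two rules and the sample translations are assumptions of mine, not a description of any particular terminology tool.

```python
from dataclasses import dataclass, field

@dataclass
class TermEntry:
    source: str
    target: str
    context: str                                  # why this term matters
    history: list = field(default_factory=list)   # (author, note, target)

class TermBase:
    """Hypothetical shared term base with two simple rules."""

    def __init__(self):
        self._entries = {}

    def add(self, source, target, context, author):
        # Rule 1: every new entry needs context and must not be a duplicate.
        key = source.casefold()
        if not context.strip():
            raise ValueError("new entries need a context note")
        if key in self._entries:
            raise ValueError(f"{source!r} already exists; use update()")
        entry = TermEntry(source, target, context)
        entry.history.append((author, "added", target))
        self._entries[key] = entry
        return entry

    def update(self, source, new_target, author, reason):
        # Rule 2: every change records who made it and why.
        if not reason.strip():
            raise ValueError("changes need a reason")
        entry = self._entries[source.casefold()]
        entry.history.append((author, reason, new_target))
        entry.target = new_target
        return entry

tb = TermBase()
tb.add("Board of Directors", "Conselho de Administração",
       context="corporate governance section", author="translator")
tb.update("Board of Directors", "Conselho Diretor",
          author="reviewer", reason="client feedback")
print(tb.update.__doc__ is None, len(tb._entries))
```

The history list is where feedback lands: each stakeholder preference becomes a recorded change rather than a comment lost in an email thread.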

On a personal note, it took me many years to develop a friendly relationship with glossaries. When working with large or imprecise glossaries, I tended to regard them more as a nuisance than as a tool that could truly help me. As I developed more context-specific and precise glossaries with my team, I realized that they helped me tremendously during the translation process, allowing me to focus much more on general flow and style rather than on trying to remember and juggle terms. In the end, a glossary is an awesome tool, but not enough care and attention go into handling it with mastery. People spend way too much time trying to enforce concepts (more terms, more checks, more automated QA) and way too little time trying to master the art behind it.