Internationalization and Localization of Open Source Clinical Research Tools

Tom Hickerson

Clinovo, the company that created ClinCapture, an open-source Electronic Data Capture (EDC) platform for clients in the pharma and life sciences space, recently localized ClinCapture to Russian for Synergy Research Group (SRG). SRG, the fastest growing Clinical Research Organization (CRO) in Russia and Eastern Europe, is the latest company to join Clinovo's CRO partners group. I thought I would share some of Clinovo's best practices for language localization.

We were deeply honored that ClinCapture was being presented by SRG in Moscow during the Clinical Trials Russia Conference, which was a big success. It’s always a real treat to work with a partner in the field, especially one that is working in an international space such as Russia, Central Asia, and Eastern Europe.

Synergy is one of several partners that we’ve worked with this fall as part of our CRO Partnership Program, which gives us the opportunity to collaborate with CROs in the US and international fields. This is an exciting opportunity, as it allows us to collaborate with an entire ecosystem of global partners, and quickly take their feedback into account to keep improving ClinCapture.

Why internationalize EDC? As clinical trials expand into new and developing countries, English proficiency, which is usually taken for granted as the standard business language in the West, becomes less reliable. Large markets like China and Russia also have many scientifically and medically trained professionals who may not have the English skills needed to operate English-only medical software.

Internationalization and Localization (often shortened to I18N and L10N, because eighteen and ten letters fall between the first and last letters of each word) are daunting in any software application, especially in a project that has over ten years’ worth of development history. It’s worth pointing out that I18N refers to building a system so that it can be translated into multiple languages, while L10N refers to adapting an already internationalized system to a specific language. In our case, ClinCapture has been through multiple rounds of I18N, and the latest round of L10N is taking place in Russian.
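
To make the distinction concrete, here is a minimal sketch in Python (chosen for brevity; ClinCapture itself is written in Java). The catalog keys, the `translate` helper, and the sample strings are all hypothetical and are not taken from the ClinCapture code base. Under those assumptions, the lookup function with its English fallback stands for the I18N work, and supplying the Russian catalog stands for the L10N work.

```python
# Hypothetical message catalogs: one dictionary per language, keyed by message ID.
# I18N is replacing hard-coded UI text with these keys;
# L10N is supplying a new catalog such as "ru".
CATALOGS = {
    "en": {"login.title": "Sign in", "subject.add": "Add subject"},
    "ru": {"login.title": "Вход"},  # partially translated on purpose
}

DEFAULT_LANGUAGE = "en"


def translate(key: str, language: str) -> str:
    """Return the text for `key` in `language`, falling back to English."""
    catalog = CATALOGS.get(language, {})
    if key in catalog:
        return catalog[key]
    # Fall back to the default language, and finally to the key itself,
    # so an untranslated message never renders as a blank label.
    return CATALOGS[DEFAULT_LANGUAGE].get(key, key)


print(translate("login.title", "ru"))  # entry present in the Russian catalog
print(translate("subject.add", "ru"))  # entry missing in Russian, English is used
```

The fallback is a design choice rather than a requirement: it lets a partially translated release remain usable while the remaining strings are still being localized.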

I’d like to write a little bit about how far we’ve come since the first internationalization project for ClinCapture. My first take-away to our readers: When you develop a feature, you need to assess from the start if this feature will need I18N/L10N, versus a feature to be used only in English, and implement with that requirement in mind as you work with the development roadmap.

I helped implement the first I18N changes to OpenClinica, the open-source predecessor to ClinCapture, in 2007, integrating changes contributed by BAP Health, an outside company in Spain. They were one of our first outside collaborators, and the I18N project was completely driven by them as an open-source, community effort. It allowed us to offer our solutions to a number of European and Asian organizations and agencies, and was a great step forward in making the project more popular with its growing community.

Because it shares the same open-source code, ClinCapture has been available in a number of languages ever since. However, we’ve grown quite a bit in the last two years, and the code and feature set for ClinCapture are much different now. There are certain challenges that you have to keep in mind as you grow an application and develop new multi-language features. They include the following:

  • Alphabets and their encoding. ClinCapture has always captured data in Unicode, which allows for non-Latin alphabets. (For more on the subject and why it is a complex issue, you can read the landmark article by Joel Spolsky for developers here.) You may get the impression that this is not really an issue, until you start to work with text files yourself. Editing files in Notepad, for example, will save them in one encoding, while editing files in Microsoft Word will produce an entirely different one.
  • Dates. In working with dates, you quickly find out that Czech, French, and Russian dates can all look different on the web page, but all have to be cleanly read and saved into the database in the same way. We’ve worked with many calendar widgets in the past, but are now upgrading the ‘date picker’ for all our dates to the latest standard in our upcoming releases. You can see more about the work we’re doing with that here.
  • Making sure your code is not affected by non-English characters. ‘Code’ exists at many levels, especially in a rich web application like ClinCapture, which has JavaScript running in the browser and Java running on the server. It’s especially important to review the JavaScript and make sure that a word in Russian won’t accidentally break code that was only expecting words in English.
  • Making sure settings are accurately reflected in the application. Up until now, ClinCapture determined each user’s language from the browser setting. Our experience has been that different browsers sometimes interpret the default language differently, and the language of the operating system may affect the browser’s language setting as well. To counter this unpredictability, ClinCapture will now ask the user to choose a language explicitly. You can see more about the work we’re doing with that here.
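
Three of the points above (encoding, dates, and language settings) can be illustrated with a short sketch. It is written in Python for brevity and is not how ClinCapture, a Java application, implements them. The format table, the list of supported languages, and the function names are assumptions made up for this example. The header parsing is also simplified: it takes languages in the order listed and ignores quality values.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical display formats per language; a real application would take
# these from its locale data rather than from a hand-written table.
DATE_FORMATS = {
    "en": "%m/%d/%Y",
    "fr": "%d/%m/%Y",
    "ru": "%d.%m.%Y",
    "cs": "%d. %m. %Y",
}
SUPPORTED_LANGUAGES = ("en", "fr", "ru", "cs")


def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes with an explicit encoding instead of a platform default."""
    return raw.decode(encoding)


def parse_local_date(text: str, language: str) -> date:
    """Parse a date typed in the user's format into one canonical value."""
    return datetime.strptime(text.strip(), DATE_FORMATS[language]).date()


def resolve_language(user_choice: Optional[str], accept_language: str) -> str:
    """Prefer the language the user picked; the browser header is a fallback."""
    if user_choice in SUPPORTED_LANGUAGES:
        return user_choice
    for item in accept_language.split(","):
        # Reduce an entry such as "ru-RU;q=0.9" to its base code "ru".
        code = item.split(";")[0].strip().split("-")[0].lower()
        if code in SUPPORTED_LANGUAGES:
            return code
    return "en"


word = "Пациент"
# The same text produces different bytes under different encodings.
print(word.encode("utf-8") != word.encode("cp1251"))
# Differently formatted inputs end up as the same stored date.
print(parse_local_date("05.03.2014", "ru") == parse_local_date("03/05/2014", "en"))
# An explicit user choice wins over whatever the browser reports.
print(resolve_language("ru", "en-US,en;q=0.9"))
```

Storing every date as a single canonical value (here a `date`, which serializes to ISO 8601) keeps the database independent of how any one user typed or viewed it.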

The Developer Team and Testing Team are working very hard to make sure ClinCapture is the best not only in English, but in any language. There are always a number of issues with making sure any application is fully compliant and fully internationalized. Our engagement with partners and clients like Synergy Research gives us a short, positive feedback loop within which we can quickly find and fix issues with multiple language support.

What do you think about our technical feature set and roadmap for ClinCapture in the future? You are always welcome to sign up for a community account and contribute to the discussion here at our community development site. Also, if you are an international user or CRO and would like to see ClinCapture in your local language, you can contact us about collaborating with us in our next localization efforts.

Comments

Hello, Tom. I would like to recommend to you and your readers a tool, if you're involved in localization projects. It's very useful in the management of collaborative software translation projects. It's https://poeditor.com/

regards