Word Count Tools

Website translation projects, like any other translation projects, are paid by the word. Counting the words on a website, however, can be a real pain.

One of my customers told me that they usually have to open Microsoft Word and copy and paste the website into it to get the word count. This is a very tedious process, and it often means pasting in a lot of other text that is unrelated to the content that must be translated.

Fortunately, technology has stepped into the website translation services industry, and today every website translator can count the words on a website quickly and easily.

Below you will find a list of word count tools that will help you calculate the number of words efficiently, saving you a lot of time and effort.

Each tool takes its own approach to the task. Which one to choose is up to you, as it depends on your requirements and your website platform.

1. SDL Trados Studio

A core piece of functionality in any CAT tool, or TEnT (translation environment tool), is the ability to analyse the material that must be translated and tell you how many words there are and how many have already been translated before. SDL Trados Studio is no exception, and this information is calculated automatically across more than 120 file types whenever you create a translation project.
A further feature unique to SDL is the use of plugins from the SDL AppStore. SDL Analyse, for example, can do this analysis quickly and easily without even having to create a translation project, and BaccS can take a project analysis and create customer quotes and invoices from it automatically.

2. WPML plugin

The WPML plugin (the plugin that makes WordPress multilingual) has a built-in word count tool.
When you click on it, WPML asks which content types you want to count the words in. This lets you estimate the expected costs according to the kinds of content on your site. For example, you can count only the words in “pages” and ignore “blog posts”. You can also count the words in specific pages. The tool runs through the content that you selected, segments it into sentences and counts the words. It ignores all HTML markup, so that the word count is accurate. The tool doesn’t use a translation memory or fuzzy matching, so the figure it reports is the raw count. A good translation memory will likely reduce the overall word count by another 10%.
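To make the idea of "ignoring HTML markup" concrete, here is a minimal sketch in Python. It is not WPML's code, and the names `TextExtractor` and `count_words` are invented for this illustration. It assumes that a "word" is a run of letters or digits, optionally joined by an apostrophe or hyphen, and that the contents of script and style tags should not be counted.

```python
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words(html: str) -> int:
    """Return the number of words in the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # Assumption: "it's" and "well-known" each count as one word.
    return len(re.findall(r"\w+(?:['’-]\w+)*", text))


sample = "<h1>Hello world</h1><p>It's a <b>well-known</b> fact.</p><script>var x = 1;</script>"
print(count_words(sample))  # prints 6
```

Different tools draw the line between words differently (numbers, hyphenated terms, abbreviations), so two tools can report slightly different totals for the same page.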

3. The HOTH

The Free Word Count Checker Tool from The HOTH, a U.S.-based SEO agency, is a quick and easy way to check the number of words on a page of your site. Just enter the target URL into the tool, and you’ll be able to see whether the word count of that page meets your SEO goals for article length.
It’s important to avoid thin content by making sure a given page has an appropriate word count. The HOTH’s tool is well suited to this, and it’s free. You can see it here.

4. Wordfast

How you use Wordfast depends on the website's structure and back office. For WordPress websites, you can use the WPML plugin to extract translatable content into XLIFF, and then run the Wordfast word count analysis on it. For other CMSs, you can export translatable content to a CSV file. For a simple website, you can mirror it using SiteSucker or other tools. The analysis tool reports word counts and segment counts for TM matches, repetitions and internal fuzzy matches. It also returns tag counts for each of these categories. Analysis reports from Wordfast can be exported as PDF, HTML, Excel or CSV.
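The sketch below illustrates one of the categories such an analysis reports: repetitions. It is not Wordfast's algorithm. The function name `analyse_repetitions` and the report keys are made up for this example, and it only detects exact repeats after lower-casing and collapsing whitespace. Fuzzy and TM matching are not modelled.

```python
def analyse_repetitions(segments):
    """Split a list of segments into first occurrences and exact repetitions.

    Assumption: two segments are "the same" if they match after lower-casing
    and collapsing whitespace. Real CAT tools use more refined rules.
    """
    seen = set()
    report = {
        "new_segments": 0,
        "new_words": 0,
        "repeated_segments": 0,
        "repeated_words": 0,
    }
    for segment in segments:
        words = segment.split()
        key = " ".join(words).lower()
        if key in seen:
            report["repeated_segments"] += 1
            report["repeated_words"] += len(words)
        else:
            seen.add(key)
            report["new_segments"] += 1
            report["new_words"] += len(words)
    return report


segments = [
    "Add to cart",
    "Free shipping on all orders",
    "Add to cart",
    "Contact us",
    "add to cart",
]
print(analyse_repetitions(segments))
# {'new_segments': 3, 'new_words': 10, 'repeated_segments': 2, 'repeated_words': 6}
```

Repeated words are usually billed at a lower rate than new words, which is why an analysis report separates the two.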

5. Wordbee

Wordbee Beebox is a middleware solution that connects your CMS, DMS or ECM systems to Wordbee Translator or any other translation management system (TMS), making the translation of content efficient and user-friendly. To ensure all stakeholders are aware of the workload before the translation project starts, Beebox can calculate the total number of words on a website automatically. The resulting quote can then be passed to Wordbee Translator for further management. For example, the TMS owner can accept or decline the job, or attach it to an existing project to reuse that project's workflow configuration. Translation and other tasks then take place there, and the translated content is finally delivered back to the client.

6. Text United

Text United is a software company that simplifies the translation of websites, documentation and software by means of language technology. It provides a translation management system that can count words, segments and units, analyze text, and compare new material against pre-translated material. That makes it a good fit for companies looking to create and maintain multilingual apps, tech docs and websites, as it combines ordering, translation and delivery in one process. At the center of Text United is its translation memory technology, an engine that helps to decrease translation costs, speed up the process and maintain language consistency by eliminating repetitive translation.

7. OmegaT

You can put a complete website (including folders, images, etc.) in the source folder of OmegaT, and OmegaT will count any translatable text. OmegaT natively recognises (X)HTML files, wiki-based files and SVG images. With the Okapi plugin, it can also load other formats, for instance JSON, YAML or Markdown files. PHP files can be converted with Okapi Rainbow.
OmegaT can translate, and thus count, TYPO3 and WordPress exports, as well as Magento e-commerce localisation files.
Last, with OmegaT Team Project features, OmegaT can dynamically download files from any given URL. These files are updated each time the project is refreshed.

8. Lingotek

Lingotek has connectors for many different content management systems, such as WordPress, Drupal and AEM. Once the connector is installed, you can upload the website content and get an accurate word count. There are several things you need to take into consideration. They include translation memory matches, if you have TMs, as well as other content on the page that you want translated, such as page titles, meta tags, image alt tags and URLs. Once the content is loaded into the Lingotek translation management system, you can get a report of page word counts, including SEO-related content.

9. Atril Solutions

The video below shows in detail how DVX3 handles website word counts.

10. Weglot

Weglot is a translation app that displays websites in different languages. It offers a WordPress translation plugin and a Shopify translation app, but you can use it on any website. Weglot assesses and counts the number of translated words for more than 50,000 users.
The tool calculates the total number of words, as the sum of the words in the original language, excluding:
– media (image or video URLs);
– single numbers;
– single punctuation;
– paragraphs and words excluded from the translations, obviously.
So the total number of translated words is the same from one target language to another, since the number of words is calculated for the original language.
You can also visit their recently released online tool to calculate website word count.
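The exclusion rules listed above can be sketched as a small filter. This is only an illustration of the rules as described here, not Weglot's implementation. The function names, the list of media file extensions and the `excluded` parameter are assumptions made for the example.

```python
import re

# Assumption: a "media URL" is an http(s) link ending in a common image/video extension.
MEDIA_URL = re.compile(r"^https?://\S+\.(?:png|jpe?g|gif|svg|webp|mp4|webm)$", re.IGNORECASE)
# Assumption: a "single number" is made of digits with optional separators.
SINGLE_NUMBER = re.compile(r"^[\d.,]+$")
HAS_WORD_CHAR = re.compile(r"\w")


def is_countable(segment: str) -> bool:
    """Return True if the segment should contribute to the word count."""
    text = segment.strip()
    if not text:
        return False
    if MEDIA_URL.match(text):
        return False  # media (image or video URLs)
    if SINGLE_NUMBER.match(text):
        return False  # single numbers
    if not HAS_WORD_CHAR.search(text):
        return False  # punctuation only
    return True


def count_source_words(segments, excluded=()):
    """Count words in the original-language segments, honouring exclusions."""
    excluded = set(excluded)
    return sum(
        len(segment.split())
        for segment in segments
        if segment not in excluded and is_countable(segment)
    )


segments = [
    "Welcome to our shop",
    "https://example.com/img/banner.png",
    "42",
    "!",
    "Free shipping over 50 euros",
    "Internal note",
]
print(count_source_words(segments, excluded={"Internal note"}))  # prints 9
```

Because the count is taken on the original-language text, the same figure applies to every target language.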

11. Locize

Locize offers localization solutions to website and software owners. It offers a continuous localization lifecycle that separates your development process from your translation process. You can update translations anytime and anywhere, without depending on your development team to add the new translation files to the codebase. Its smart translation memory helps you save time and budget and increases the consistency of your translation project.

12. Website Download tools

You can get the website word count by downloading an entire website. You can use special tools for this purpose. These tools rip entire websites and maintain the same overall structure, and include all relevant media files too (e.g. images, PDFs, style sheets). These tools can be used to copy partial or full websites to your local hard disk so that you can calculate total word count with the help of word count tools like SDL Trados, Wordfast, etc.
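If you only need a rough figure before loading the files into a CAT tool, a short script can total the words in a downloaded copy of the site. The sketch below is one possible approach under simple assumptions: it only looks at `.html` and `.htm` files, treats a word as a run of letters or digits, and skips script and style contents. The function names are invented for this example, and the result will not match a CAT tool's analysis exactly.

```python
from html.parser import HTMLParser
from pathlib import Path
import re


class _VisibleText(HTMLParser):
    """Collects text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        if not self.skip:
            self.parts.append(data)


def words_in_html(html: str) -> int:
    """Count words in the visible text of one HTML document."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return len(re.findall(r"\w+", " ".join(parser.parts)))


def count_site_words(root) -> dict:
    """Map each HTML file under `root` (relative path) to its word count."""
    counts = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            html = path.read_text(encoding="utf-8", errors="replace")
            counts[path.relative_to(root).as_posix()] = words_in_html(html)
    return counts
```

Calling `count_site_words("mirror")` on a folder named `mirror` (a placeholder for wherever your download tool saved the site) returns a per-page breakdown, and `sum(counts.values())` gives the site total. Navigation menus and footers that repeat on every page are counted each time, so the total is an upper bound on the unique text.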