WDF*IDF

What does WDF*IDF mean?

WDF*IDF is a formula for calculating how heavily a term is weighted in your own document compared with "all available" documents on the Internet. WDF*IDF stands for "within document frequency" * "inverse document frequency", i.e. the frequency of the term in your own document multiplied by the ratio of all available documents to the number of documents containing the term.

In more detail, the first part of the formula (WDF) is calculated as follows:

$$WDF(i,j) = \frac{\log_{2}(Freq(i,j) + 1)}{\log_{2}(L)}$$

The symbols mean:

i = word

j = document

L = total number of words in document j

Freq(i,j) = Frequency of word i in document j

Explanation of "+1": if Freq(i,j) = 0, the "+1" yields log2(1) = 0 in the numerator, so a word that does not occur in the document receives a WDF of 0. The result is a value, normally between 0 and 1, that indicates the weight of the term in relation to all terms in the text.
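The WDF part can be sketched in a few lines of Python. This is only a minimal illustration, assuming the commonly cited form log2(Freq(i,j) + 1) / log2(L); the function name `wdf` and the example numbers are made up for this sketch.

```python
import math


def wdf(freq: int, total_words: int) -> float:
    """Within-document frequency: log2(freq + 1) / log2(total_words)."""
    if total_words < 2:
        # log2(1) = 0 would make the denominator zero.
        raise ValueError("total_words must be at least 2")
    return math.log2(freq + 1) / math.log2(total_words)


# Hypothetical document with L = 1024 words.
print(wdf(0, 1024))  # term does not occur: log2(1) = 0 in the numerator
print(wdf(7, 1024))  # log2(8) / log2(1024) = 3 / 10
```

Because of the logarithm, doubling the number of occurrences does not double the value, which dampens the effect of simply repeating a term.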

The second part of the formula (IDF), in its common form, is calculated as follows:

$$IDF_{t} = \log\left(\frac{N_{D}}{f_{t}}\right)$$

where $N_{D}$ denotes the number of documents and $f_{t}$ the number of documents that contain the term $t$. If the document frequency increases, the fraction becomes smaller.
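A short Python sketch of the IDF part may make this behavior easier to see. It assumes the plain form log(N_D / f_t) with a base-10 logarithm; the base, the function name `idf` and the document counts are assumptions for illustration only.

```python
import math


def idf(num_documents: int, docs_with_term: int) -> float:
    """Inverse document frequency: log10(N_D / f_t)."""
    if docs_with_term <= 0:
        raise ValueError("the term must occur in at least one document")
    return math.log10(num_documents / docs_with_term)


# Hypothetical corpus of 1000 documents.
for docs_with_term in (10, 100, 1000):
    print(docs_with_term, idf(1000, docs_with_term))
```

A term that occurs in every document ends up with an IDF of 0 under this form, so it contributes nothing to the product.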

Multiplying the two values yields a weighting that indicates how prominently the entered term occurs in your own text in relation to all available texts. The higher this value, the more relevant the term is (for the topic).
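As a worked example with made-up numbers: assume a document of L = 1024 words in which the term occurs 7 times, and a corpus of $N_{D} = 1000$ documents of which $f_{t} = 10$ contain the term. The forms log2(Freq(i,j) + 1) / log2(L) for WDF and a base-10 logarithm of $N_{D} / f_{t}$ for IDF are assumptions here; tools may use slightly different variants.

$$WDF = \frac{\log_{2}(7 + 1)}{\log_{2}(1024)} = \frac{3}{10} = 0.3$$

$$IDF = \log_{10}\left(\frac{1000}{10}\right) = \log_{10}(100) = 2$$

$$WDF \cdot IDF = 0.3 \cdot 2 = 0.6$$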

There are tools that perform this calculation for your desired keywords and display the results in a chart. One paid tool is WDF*IDF from onPage.org. A free alternative is https://www.wdfidf-tool.com/, although it is limited to 100 queries per hour (shared across all users).

How does the WDF*IDF analysis work?

The tools (usually) fetch the first 10 search results that Google returns for the keyword entered as the analysis term. These pages form the data basis against which everything else is measured. The tools then determine the frequency of various terms across all of these pages. You can also submit your own URL for the check: the tools calculate the term frequencies in the same way and then compare your URL with the data basis.
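The procedure described above can be approximated with a small Python sketch. It is not how any particular tool is implemented: the tokenizer, the base-10 logarithm for IDF, the choice to compute IDF over the reference pages plus your own page, and the sample texts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Very simple tokenizer: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def wdf_idf_scores(documents: list[str]) -> list[dict[str, float]]:
    """Return one {term: WDF*IDF} mapping per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # f_t: number of documents that contain each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    scores = []
    for tokens in token_lists:
        counts = Counter(tokens)
        length = max(len(tokens), 2)  # avoid log2(1) = 0 in the denominator
        scores.append({
            term: (math.log2(freq + 1) / math.log2(length))
            * math.log10(n_docs / doc_freq[term])
            for term, freq in counts.items()
        })
    return scores


# Hypothetical stand-ins for the top search results.
pages = [
    "anchor text is the clickable text of a link",
    "link text and anchor text help search engines",
    "a vserver is a virtual server",
]
own_page = "anchor text anchor text anchor text for every link"

scores = wdf_idf_scores(pages + [own_page])
own_scores = scores[-1]
for term, value in sorted(own_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term}: {value:.3f}")
```

A real tool would additionally fetch the pages, strip markup and usually handle multi-word terms such as "anchor text", which this sketch leaves out.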

The result is a chart, usually a bar chart. Each bar is assigned to a term and its height corresponds to the WDF*IDF value (chart from the free tool):

If your comparison point (yellow) lies above the bars, your value is above the so-called spam line. You should avoid this because search engines might suspect "keyword stuffing" on your site or blog.
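A spam-line check of this kind could look roughly like the following Python sketch. It assumes that the spam line for a term is the highest WDF*IDF value among the reference pages and that only terms found on those pages are charted; both points, the function name and the scores are assumptions, not the documented behavior of any tool.

```python
def terms_above_spam_line(
    own_scores: dict[str, float],
    reference_scores: list[dict[str, float]],
) -> dict[str, tuple[float, float]]:
    """Map each flagged term to (own value, highest reference value)."""
    flagged = {}
    for term, own_value in own_scores.items():
        ceiling = max((ref.get(term, 0.0) for ref in reference_scores), default=0.0)
        # Skip terms that no reference page uses: they are not part of the chart.
        if ceiling > 0.0 and own_value > ceiling:
            flagged[term] = (own_value, ceiling)
    return flagged


# Made-up WDF*IDF values for two reference pages and your own page.
reference_scores = [
    {"anchor text": 0.9, "link text": 0.4},
    {"anchor text": 1.1, "backlink": 0.5},
]
own_scores = {"anchor text": 1.6, "link text": 0.3, "vserver": 0.7}

print(terms_above_spam_line(own_scores, reference_scores))
```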

The table below provides more precise figures:

You can see that the terms "anchor text", "anchor texts" and "link text" are relevant for the topic "anchor text", whereas "press portals" and "vserver" are not.

The tool also shows the "competition", i.e. the data basis:

It is worth taking a closer look at these pages as "best practice" examples and modeling your own content on them.

How can I optimize my site with WDF*IDF?

Use this analysis primarily to see whether your terms are above the spam line. If they are, you should reduce these terms by omitting them or replacing them with synonyms. Secondly, the analysis is very useful for keyword research. Especially if you want to write a detailed text, you should check whether you cover all aspects (i.e. the most important terms listed in the analysis). For example, related aspects of the topic "anchor text" include link building, backlinks, optimization and context.
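The keyword-research use can be sketched in Python as a simple gap check: list the strongest terms from the reference pages that your own text does not contain yet. The function name `missing_terms`, the ranking by highest reference value and the scores are hypothetical choices for this illustration.

```python
def missing_terms(
    own_scores: dict[str, float],
    reference_scores: list[dict[str, float]],
    top_n: int = 5,
) -> list[str]:
    """Return the strongest reference terms that do not occur in your own text."""
    best: dict[str, float] = {}
    for ref in reference_scores:
        for term, value in ref.items():
            best[term] = max(value, best.get(term, 0.0))
    ranked = sorted(best, key=lambda term: best[term], reverse=True)
    return [term for term in ranked if term not in own_scores][:top_n]


# Made-up WDF*IDF values for two reference pages and your own page.
reference_scores = [
    {"anchor text": 1.1, "link building": 0.8, "backlink": 0.6},
    {"anchor text": 0.9, "context": 0.4, "backlink": 0.7},
]
own_scores = {"anchor text": 1.0, "context": 0.2}

print(missing_terms(own_scores, reference_scores))
```

The returned terms are candidates to review by hand; not every term used by competing pages belongs in your own text.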

Thirdly, the analysis helps you check whether you use the established terms frequently enough. The data basis consists of the top results ("they must be doing it right"), so it is also worth looking at what the "others" are doing better than you. If you're not quite happy with the usual WDF*IDF tools such as those from Ryte, the convenient "Sistrix Content Assistant" may be an alternative. However, the Sistrix modules come at considerable cost.

This video was uploaded by the creator of the free WDF*IDF tool and is intended to give you a broader overview of the tool:

