When working on SEO, being indexed by crawlers is essential, yet crawls do not happen on your schedule.
So how should we deal with crawlers? Let's look at the points that promote crawlability.
What is a crawler?
"Crawler" is a term you hear often when working on SEO.
A crawler is a program used by search engines to collect data (images, text, and so on) from web pages on the Internet; it is sometimes referred to as a "spider."
Google's crawler collects data from websites and follows links, including links shared on social media, to discover the existence of other sites. In addition to waiting for the crawler to visit, you can also announce "I have a new website!" and ask to have it crawled.
Register your website's address to be crawled:
- For Google: "Search Console"
- For Bing: "Webmaster Tools"

The crawler follows links to each page from the registered address and collects data on each page.
However, just registering a URL does not guarantee efficient crawling.
So how do you crawl every corner of your website?
Let’s take a look at how to improve crawlability (ease of crawling) in the next section.

To promote crawlability
- Utilize a sitemap
Although basic, one way to promote crawlability is to use a "sitemap" together with Search Console or Bing's Webmaster Tools.
A sitemap is an XML file that lists the URL of each page on the site, along with its priority, update frequency, and last modification date.
In WordPress, plugins such as All in One SEO Pack can generate a sitemap automatically.

Once you register the generated sitemap address in Webmaster Tools, the crawler will start visiting those pages and collecting data.
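The XML structure described above can be sketched in a few lines of Python. The page list and `example.com` URLs below are hypothetical placeholders; a real sitemap would list your own pages and actual modification dates.

```python
from datetime import date
from xml.etree import ElementTree as ET

# Hypothetical page list; URLs and values are placeholders for illustration.
pages = [
    {"loc": "https://example.com/", "priority": "1.0", "changefreq": "daily"},
    {"loc": "https://example.com/about", "priority": "0.5", "changefreq": "monthly"},
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)
for page in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page["loc"]
    ET.SubElement(url, "lastmod").text = date.today().isoformat()  # last modification date
    ET.SubElement(url, "changefreq").text = page["changefreq"]     # update frequency hint
    ET.SubElement(url, "priority").text = page["priority"]         # relative priority

# Serialize to the XML file you would upload as sitemap.xml
xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

In practice you would save this output as `sitemap.xml` at the site root rather than printing it.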
- Perform URL canonicalization
As with sitemaps, link settings that mix URL patterns, such as URLs starting with "http" versus "https", or URLs with and without "www", can confuse the crawler; in particular, the presence or absence of "www" may cause a link to be misrecognized as a link from another server.
You probably use "www" without thinking about it, but take Yahoo! as an example: Yahoo!'s official page is "https://www.yahoo.co.jp/", while Yahoo! News is "https://news.yahoo.co.jp/".
The characters to the left of the domain, such as "www" and "news", are called host names, and they are used to assign different servers to different roles within the same domain.

Search Console and similar tools prompt you to register "http" and "https" versions, and each host name, separately, because their content is often interpreted as belonging to different servers.
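One way to keep internal links consistent is to normalize every URL variant to a single canonical form before emitting it. The sketch below assumes `https://www.example.com` is the chosen canonical origin; the host names and alias set are illustrative assumptions, not a prescription.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed canonical scheme and host for this hypothetical site.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"
# Variants of the same site that should collapse to the canonical host.
ALIASES = {"example.com", "www.example.com"}

def canonicalize(url: str) -> str:
    """Rewrite http/https and www/no-www variants to one canonical form."""
    parts = urlsplit(url)
    if (parts.hostname or "") in ALIASES:
        parts = parts._replace(scheme=CANONICAL_SCHEME, netloc=CANONICAL_HOST)
    return urlunsplit(parts)

print(canonicalize("http://example.com/page"))  # -> https://www.example.com/page
```

Note that a genuinely different host name, such as `news.example.com`, is left untouched, matching the point above that different host names may really be different servers.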
- Return the correct HTTP status code to the crawler
Pages that have been deleted or moved should return the correct HTTP status code: 410 for a deleted page, 301 for a permanent move, and 302 for a temporary move.
If the correct status code is not returned, the request is treated as an error, and sites where such errors occur frequently may see reduced crawlability.
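The mapping above can be sketched as a small routing helper. The page tables below are hypothetical examples of one site's deleted and moved paths, not part of any real framework.

```python
from http import HTTPStatus

# Hypothetical examples of pages in each state.
PERMANENT_MOVES = {"/old-blog": "/blog"}       # should return 301
TEMPORARY_MOVES = {"/sale": "/campaign"}       # should return 302
DELETED_PAGES = {"/discontinued-product"}      # should return 410

def respond(path: str):
    """Return (status_code, redirect_target_or_None) for a request path."""
    if path in DELETED_PAGES:
        return HTTPStatus.GONE.value, None                                # 410: removed for good
    if path in PERMANENT_MOVES:
        return HTTPStatus.MOVED_PERMANENTLY.value, PERMANENT_MOVES[path]  # 301: permanent move
    if path in TEMPORARY_MOVES:
        return HTTPStatus.FOUND.value, TEMPORARY_MOVES[path]              # 302: temporary move
    return HTTPStatus.OK.value, None

print(respond("/old-blog"))  # -> (301, '/blob' is wrong; correct: '/blog')
```

In a real deployment this logic usually lives in the web server configuration (for example, redirect rules) rather than application code, but the status codes are the same.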
- Improve navigation and links

Crawlers follow links within a page to visit pages within a site and collect data.
If a page has many broken links, or cannot be reached because appropriate links have not been set, it will be excluded from the index.
Normally, links are set using the HTML <a> tag, but when links are generated dynamically with JavaScript, the crawler may fail to interpret the JavaScript and conclude that no link exists.
To be safe, it is best to set up links and navigation using static HTML with appropriate anchor text.
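The difference can be demonstrated with a plain HTML parser standing in for a simple crawler: it finds the static `<a href>` link but sees nothing in the JavaScript-driven one. The markup snippets are hypothetical examples.

```python
from html.parser import HTMLParser

# Hypothetical markup: the same "link" written statically and via JavaScript.
STATIC = '<a href="/products">Product list</a>'
DYNAMIC = '<span onclick="location.href=\'/products\'">Product list</span>'

class LinkCollector(HTMLParser):
    """Collects href targets of <a> tags, like a minimal crawler would."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def extract_links(html: str):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

print(extract_links(STATIC))   # -> ['/products']
print(extract_links(DYNAMIC))  # -> []
```

A parser that does not execute JavaScript never discovers the second link, which is exactly the risk the paragraph above describes.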

Summary

“Crawlability” refers to how easy it is for crawlers to crawl through a website.
Being crawled and indexed by search engines is the first step in SEO.
If you fail at the basics of being crawled and indexed by search engines, you may reduce the effectiveness of the various SEO measures you take later.
Rather than focusing only on mass-producing content, it is also important to check whether your navigation and hierarchical structure are easy for crawlers to traverse.





