Search Engines 102 – Part #2, Improving Spiderability
by Alli Summerford
January 1, 2006
In Part #1 of this article, I talked about content development and page titles, and their enormous impact on a site’s placement and efficacy in the search engines. In this follow-up article, I will step back and discuss issues that relate to the indexability, or spiderability, of your site. Now that you have all that great content in place, it is time to make sure that the search engines can find it.
What is spiderability and why is it so important?
Let’s start with a definition of spiderability (synonymous with indexability).
Spiderability: A measure of how well a site can be indexed (or crawled) by a search engine (SE) spider (the software program that SEs use to gather information off the web).
Whether or not (and how well) your site is spiderable will have a direct impact on your site’s search engine rankings, making this factor one of the most important in search engine optimization. A site must be built in a way that allows the search engine spiders to crawl the entire site, gathering all of the site’s relevant information/content, not just that of the home page.
What can I do to improve my site’s spiderability?
Here is a list of the top five issues related to spiderability. You can check these against your site to determine areas of potential improvement. In addition, I offer solutions for each spiderability problem just under its explanation.
1. Frames – A site built using frames has a distinct disadvantage in the search engines. If not coded correctly, the SE spiders may not be able to find your keyword-rich pages of content.
Solution: Do not design or build your site using frames. If your site is already built using frames, consider a re-build or make sure that the proper work-arounds are in place to ensure spiderability.
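If you must keep your frames for now, the standard work-around is the <noframes> element, which gives spiders (and frames-incapable browsers) real content and text links to follow. Here is a minimal sketch; the page names and company details are placeholders, not a prescription:

    <html>
    <head>
      <title>Widgets and Gadgets | Example Company</title>
    </head>
    <frameset cols="25%,75%">
      <frame src="nav.html" name="navigation">
      <frame src="main.html" name="content">
      <!-- Spiders that ignore frames read this block instead, so give
           them descriptive text and crawlable links rather than the
           usual "your browser does not support frames" message. -->
      <noframes>
        <body>
          <p>Example Company makes widgets and gadgets.</p>
          <p>
            <a href="products.html">Products</a> |
            <a href="about.html">About Us</a> |
            <a href="contact.html">Contact</a>
          </p>
        </body>
      </noframes>
    </frameset>
    </html>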
2. Image-Based Navigation – Ask your web developer if your site has image-based navigation. SE spiders cannot follow these links and are stopped from ever getting any farther than the home page if there is no alternate path for them to follow. (Solution shown under #3 below.)
3. JavaScript-Based Navigation – As with #2 above, you will need to ask your web developer if your site’s navigation uses JavaScript. SE spiders do not execute JavaScript, so JavaScript-generated links present the same barrier as image-based ones.
Solution: A simple set of text links along the bottom of your site’s pages will correct both problems #2 and #3. See any page of my site for an example of this work-around in place. The SE spiders can follow these text links, allowing them to index deeper into the site’s content.
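As a rough sketch, a footer like the following is all it takes; the page names are placeholders for your own:

    <!-- Plain text links near the bottom of every page. Spiders can
         follow these even when the main navigation is built from
         images or JavaScript. -->
    <p>
      <a href="index.html">Home</a> |
      <a href="products.html">Products</a> |
      <a href="about.html">About Us</a> |
      <a href="sitemap.html">Site Map</a> |
      <a href="contact.html">Contact</a>
    </p>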
4. CSS and JavaScript Code Embedded in the Page – SE spiders are interested in the visible body text of your web pages, not in your style sheets or JavaScript. Often, these blocks of code appear in the <HEAD> section of every page, increasing download time and pushing the keyword-rich content further down in the code.
Solution: JavaScript and CSS code should not be embedded on each page, but instead put into an external file that is simply referenced in each page of the site. Doing so has several advantages, but one of the most compelling is that your site’s keywords and content all move up, up, up in the code, signaling to the search engines their importance and boosting your site’s relevancy ratings.
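Here is a sketch of what this looks like in practice, assuming hypothetical file names styles.css and scripts.js:

    <head>
      <title>Widgets and Gadgets | Example Company</title>
      <!-- Reference external files instead of pasting hundreds of
           lines of CSS and JavaScript into every page: -->
      <link rel="stylesheet" type="text/css" href="styles.css">
      <script type="text/javascript" src="scripts.js"></script>
    </head>
    <body>
      <!-- The keyword-rich content now begins much earlier in the
           source code. -->
      <h1>Widgets and Gadgets</h1>
      <p>Example Company makes widgets and gadgets.</p>
    </body>

As a bonus, browsers cache the external files, so every page after the first loads faster for your visitors as well.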
5. Lack of Site Map – A site map not only helps site users find the information they need as quickly as possible, it also ensures that the SE spider can follow text links to all the main pages of your site.
Solution: Create a complete site map and include a text link to it along the bottom of every page in your site.
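A site map can be as simple as a single page of nested text links. A minimal sketch, with placeholder page names:

    <!-- sitemap.html: a plain list of text links to every main page -->
    <h1>Site Map</h1>
    <ul>
      <li><a href="index.html">Home</a></li>
      <li><a href="products.html">Products</a>
        <ul>
          <li><a href="widgets.html">Widgets</a></li>
          <li><a href="gadgets.html">Gadgets</a></li>
        </ul>
      </li>
      <li><a href="about.html">About Us</a></li>
      <li><a href="contact.html">Contact</a></li>
    </ul>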
Coming Next: Part #3 will discuss Google’s new program, Google Sitemaps, and how you can use it to your advantage.