Check your broken links for search spiders

Every search engine (apart from search directories) uses programs called “spiders” to crawl whole websites and grab all the relevant data they can from each page.

The spider will enter the homepage of a website and then follow every link it comes across in the code after saving the content.  It will then crawl each page linked from the homepage and keep repeating the process until it has found every single page on the entire website.
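As a rough illustration of that process, here is a minimal sketch of a breadth-first crawl. It is not how any particular search engine works: the `crawl` function, the `LinkExtractor` class and the in-memory `SITE` dictionary are all made up for this example, and pages are read from the dictionary rather than fetched over HTTP.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl; fetch(url) returns HTML, or None if the page is missing."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: nothing to save or follow
        order.append(url)  # stand-in for "saving the content"
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve the link against the current page
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Hypothetical three-page site held in memory (assumption for this sketch).
SITE = {
    "https://example.com/": '<a href="/about/">About</a> <a href="/shop/">Shop</a>',
    "https://example.com/about/": '<a href="/">Home</a>',
    "https://example.com/shop/": '<a href="/about/">About</a> <a href="/missing/">Old</a>',
}

pages = crawl("https://example.com/", SITE.get)
print(pages)
```

The `seen` set is what stops the spider from visiting the same page twice when several pages link to each other.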

An XML sitemap is a good way to tell search spiders exactly where all your pages of content are located. You should link to the sitemap from your robots.txt file.
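To show what that link looks like, the sketch below parses a small robots.txt with Python's standard `urllib.robotparser` module. The robots.txt content, the `example.com` URLs and the `/checkout/` rule are placeholders invented for the example, and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the sitemap URL is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The Sitemap line is how a spider learns where the XML sitemap lives.
sitemaps = parser.site_maps()

# The same file also tells spiders which paths they may crawl.
can_crawl_home = parser.can_fetch("*", "https://www.example.com/")
can_crawl_checkout = parser.can_fetch("*", "https://www.example.com/checkout/")

print(sitemaps, can_crawl_home, can_crawl_checkout)
```

The `Sitemap:` line takes a full URL and can sit anywhere in the file, independent of the `User-agent` groups.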

The problem with search engine spiders is that they can get stuck or lost.

PHP websites, like ones built on the Magento e-commerce software, can rewrite URLs based on old URLs.  One slip in the code could create a never-ending loop of subfolders, as shown below:

- /product/ links to /cart/

- /product/cart/ links to /product/

- /product/cart/product/ links to /cart/

- /product/cart/product/cart/ links to /product/, and so on without end.
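One common way a loop like this arises is a relative link that gets resolved against an ever-deeper folder. The sketch below reproduces the pattern with the standard `urljoin` function and adds a simple guard a crawler might use. The helper names `follow_relative_links` and `has_repeating_segments`, the `example.com` host and the `max_repeats` limit are assumptions for illustration, and the guard is a crude heuristic that could also flag legitimate URLs.

```python
from urllib.parse import urljoin, urlsplit


def follow_relative_links(start_url, hrefs, steps):
    """Follow a repeating cycle of relative hrefs, one per hop."""
    url = start_url
    visited = [url]
    for i in range(steps):
        # A relative href is resolved against the current folder,
        # so each hop nests one level deeper.
        url = urljoin(url, hrefs[i % len(hrefs)])
        visited.append(url)
    return visited


def has_repeating_segments(url, max_repeats=2):
    """Flag URLs where any path segment occurs more than max_repeats times."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return any(segments.count(s) > max_repeats for s in set(segments))


trail = follow_relative_links(
    "https://example.com/product/", ["cart/", "product/"], 5
)
for url in trail:
    print(url, "<- possible loop" if has_repeating_segments(url) else "")
```

Writing the links as root-relative paths (`/cart/` rather than `cart/`) avoids the nesting in the first place.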

Some spiders also have difficulty with web technologies such as JavaScript, Flash or even CSS.  A Flash or JavaScript navigation bar could spell disaster for a search engine crawler. Luckily, major players like Google, Bing and Yahoo can handle these most of the time.

Broken links can also stop spiders from crawling further into your website. Many web spiders have timeout thresholds, so if they run into trouble they won’t spend all day crawling one site.  If a spider keeps finding broken links on your site, it may quit and move on, leaving only half your site in its index.
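The effect can be pictured with a toy model. Search engines do not publish their thresholds, so the `max_broken` limit, the `crawl_with_budget` function and the `STATUSES` table below are purely hypothetical; the sketch only shows how a spider that gives up after a few broken links never reaches the pages further down the queue.

```python
def crawl_with_budget(urls, get_status, max_broken=3):
    """Visit URLs in order; give up once max_broken broken links are seen.

    get_status(url) returns an HTTP status code (hypothetical callable).
    """
    indexed = []
    broken = 0
    for url in urls:
        status = get_status(url)
        if status >= 400:
            broken += 1
            if broken >= max_broken:
                break  # budget exhausted: the rest of the site is never seen
            continue
        indexed.append(url)
    return indexed


# Hypothetical site: status codes keyed by path, with three dead links early on.
STATUSES = {
    "/": 200,
    "/old-1": 404,
    "/shop": 200,
    "/old-2": 404,
    "/old-3": 404,
    "/blog": 200,
    "/contact": 200,
}

indexed = crawl_with_budget(list(STATUSES), STATUSES.get, max_broken=3)
print(indexed)
```

In this run `/blog` and `/contact` are perfectly healthy pages, but they are never indexed because the budget runs out before the spider gets to them.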

Your code should be clean and clear for users and search engines, and so should your links. There are free tools to check for broken links on any site. If you have a Windows PC, I suggest Xenu’s Link Sleuth, written by Tilman Hausherr; if you have a Mac, try Integrity, which does exactly what it says on the tin.


