By: Phil Craven, Published: 2003-08-7

Tips

Domain names and Filenames
To a spider, www.domain.com/, domain.com/, www.domain.com/index.html and domain.com/index.html are different urls and, therefore, different pages. Surfers arrive at the site’s home page whichever url is used, but spiders see them as individual urls, and that makes a difference when PageRank is worked out. It is better to standardize the url you use for the site’s home page. Otherwise each url can end up with its own share of PageRank, when all of it should have gone to just one url.

If you think about it, how can a spider know the filename of the page that it gets back when requesting www.domain.com/? It can’t. The filename could be index.html, index.htm, index.php, default.html, etc. If you link to index.html within the site, the spider could compare the two pages, but that seems unlikely. So they are two urls, and each receives PageRank from its own inbound links. Standardizing the home page’s url ensures that the PageRank it is due isn’t shared with ghost urls.
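One way to keep a site's own links consistent is to pass every internal url through a small normalizing function before it is written into a page. The sketch below is only an illustration: the function name, the choice of www.domain.com as the preferred host, the http scheme and the list of default filenames are all assumptions, not something a spider or a particular server is known to do.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the default-document names the server might answer with.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "index.php", "default.html"}


def canonical_url(url, preferred_host="www.domain.com"):
    """Map the variants of a url onto one standard form (hypothetical helper)."""
    # Accept urls written without a scheme, as in the article's examples.
    parts = urlsplit(url if "://" in url else "http://" + url)

    # Treat the bare and the "www." host as the same site.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    host = parts.netloc.lower()
    if host in (bare, "www." + bare):
        host = preferred_host

    # Drop a trailing default filename so "/index.html" becomes "/".
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail in DEFAULT_DOCUMENTS:
        path = head + "/"

    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "www.domain.com/",
    "domain.com/",
    "www.domain.com/index.html",
    "domain.com/index.html",
]
# All four home-page variants collapse to a single url.
home_urls = {canonical_url(v) for v in variants}
```

With the four home-page variants from the text, `home_urls` ends up holding one entry, so every internal link points at the same url.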

Example: Click here to go to a holiday accommodation website. Notice that the url in the browser’s address bar contains “www.”. If you have the Google Toolbar installed, you will see that the page has PR6. Now remove the “www.” part of the url and load the page again. This time it has PR4, and yet they are the same page. Actually, the PageRank is for the unseen frameset page. The links within the site and to the site use different urls for the same page, and so split the PageRank between them. That’s not the best way to do it.

Imagine the page www.domain.com/index.html. The index page contains links to several relative urls, e.g. products.html and details.html. The spider sees those urls as www.domain.com/products.html and www.domain.com/details.html. Now let’s add an absolute url for another page, only this time we’ll leave out the “www.” part: domain.com/anotherpage.html. This page links back to the index.html page with a relative url, so the spider sees the index page as domain.com/index.html. Although it’s the same index page as the first one, to a spider it is a different page because it’s on a different domain. Now look what happens. Each of the relative urls on the index page is duplicated as well, because it now resolves under the domain.com/ domain. Consequently, the link structure is wasting the site’s potential PageRank by spreading it between ghost pages.
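The way relative urls multiply can be sketched with Python's standard `urljoin`, which resolves a relative link against the url of the page it appears on. The http scheme and the helper name `resolve` are assumptions added for the illustration; the hosts and filenames are the ones used in the example above.

```python
from urllib.parse import urljoin

# The relative links found on the index page.
relative_links = ["products.html", "details.html"]


def resolve(base, links):
    """Resolve each relative link against the url the page was fetched from."""
    return [urljoin(base, link) for link in links]


# The same index page, reached under two different hosts.
www_urls = resolve("http://www.domain.com/index.html", relative_links)
bare_urls = resolve("http://domain.com/index.html", relative_links)

# Two real pages, but four distinct urls for a spider to rank.
distinct_urls = set(www_urls) | set(bare_urls)
```

Two files on disk become four urls, which is the ghost-page effect in miniature.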

Adding new pages
There is a possible negative effect of adding new pages. Take a perfectly normal site. It has some inbound links from other sites and its pages have some PageRank. Then a new page is added to the site and is linked to from one or more of the existing pages. The new page will, of course, acquire PageRank from the site’s existing pages. The effect is that, whilst the total PageRank in the site is increased, one or more of the existing pages will suffer a PageRank loss due to the new page making gains. Up to a point, the more new pages that are added, the greater the loss to the existing pages. With large sites, this effect is unlikely to be noticed but, with smaller ones, it probably would be.

So, although adding new pages does increase the total PageRank within the site, some of the site’s pages will lose PageRank as a result. The answer is to link new pages in such a way within the site that the important pages don’t suffer, or to add sufficient new pages to make up for the effect (which can sometimes mean adding a large number of new pages), or, better still, to get some more inbound links.
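The effect can be tried out on a toy site. The sketch below assumes the commonly quoted form of the PageRank equation, PR(A) = (1 - d) + d × (sum of PR(T)/C(T) over the pages T linking to A), with a damping factor d of 0.85; neither the formula nor the value is stated in this section. The three-page site, the page names and the decision to link the new page D back to the home page are made up for the illustration, and inbound links from other sites are left out to keep it small.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate the assumed PageRank equation over a small link graph.

    `links` maps each page to the list of pages it links to.
    """
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Each linking page passes on its rank divided by its link count.
            inbound = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * inbound
        ranks = new_ranks
    return ranks


# Hypothetical site: home page A links to B and C, which link back to A.
before = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})

# The same site after adding page D, linked from A and linking back to A.
after = pagerank({"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]})

total_before = sum(before.values())
total_after = sum(after.values())
```

In this arrangement the site's total goes up, B and C each end up with less than they had, and the home page A ends up with more, because the new page sends its PageRank straight back to it. Other linking arrangements would spread the gains and losses differently.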

