XML Sitemaps – good idea – badly implemented…
Firstly – I think that sitemaps are a good idea, badly executed. Most sitemaps seem to be produced with little thought for how they will actually be used.
Why do I think they are a bad idea from Google?
Well, they need to be in an XML format, which means that a huge number of less experienced website owners can’t easily produce them except by using one of the many tools that scan the website and produce a snapshot of the pages. If this snapshot is perfect, it’s pretty much the same as what Google already sees when it crawls your site; if it’s not, it could potentially be worse. Typically people upload these without reviewing them at all, duplicates, warts and all. In that case there is little point.
For it to work properly, the sitemap needs to be linked into your CMS and created with time and energy put into getting it right, particularly the three optional values: priority, changefreq and lastmod. These aren’t static numbers and should change as the page changes. Some pages will be changed regularly and some won’t, but this in itself can change (imagine a page on the labour conference: it won’t change for 10 months, then almost daily for two). The date needs to be updated every time the page’s contents change (NOT the file, the contents; most sites are fed from a database now). And sitemaps shouldn’t contain CSS, JS and all the other supporting files, only the files you want indexed (including PDFs).
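As a rough illustration of what “linked into your CMS” could mean, here is a minimal Python sketch that builds the XML from page records rather than from a crawl. The record fields (url, content_updated, changefreq, priority) and the list of skipped extensions are my own assumptions for the example, not anything a particular CMS provides; the point is only that lastmod comes from when the content changed and that supporting files are filtered out.

```python
from datetime import date
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Supporting files that should stay out of the sitemap (assumed list).
# PDFs are deliberately not listed here, so they are kept.
SKIP_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of page records.

    Each record is a dict with hypothetical keys: url, content_updated
    (a date), changefreq (e.g. "daily") and priority (0.0 to 1.0).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if page["url"].lower().endswith(SKIP_EXTENSIONS):
            continue  # not something we want indexed
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["url"]
        # lastmod tracks the content, not the file on disk
        ET.SubElement(url, "lastmod").text = page["content_updated"].isoformat()
        ET.SubElement(url, "changefreq").text = page["changefreq"]
        ET.SubElement(url, "priority").text = f"{page['priority']:.1f}"
    return ET.tostring(urlset, encoding="unicode")


pages = [
    {"url": "https://example.com/", "content_updated": date(2009, 9, 1),
     "changefreq": "daily", "priority": 1.0},
    {"url": "https://example.com/style.css", "content_updated": date(2009, 1, 1),
     "changefreq": "yearly", "priority": 0.1},
    {"url": "https://example.com/report.pdf", "content_updated": date(2009, 6, 15),
     "changefreq": "monthly", "priority": 0.5},
]
print(build_sitemap(pages))
```

In a real setup the changefreq and priority values would themselves be recalculated as a page’s behaviour changes, rather than being typed in once and left alone.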
An XML file almost seems like the best option; it fits the description. The problem is that most of the people who create or manage websites don’t have a clue and so produce websites without any real consideration, a “fire and forget” solution. Oddly, this seems to include people who are normally quite savvy.
Seriously – why not offer the option of just putting up a simple text file of all your pages? It’s something that is easy for anyone to create and manage, with the more advanced XML files still available to those who can handle them.
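To show how little is involved, here is a small Python sketch of the kind of text file I mean: one full URL per line, nothing else. The function name and the example URLs are made up for illustration, and anyone could produce the same thing by hand in a text editor.

```python
def text_sitemap(urls):
    """Return the contents of a plain text sitemap: one URL per line.

    Blank entries are dropped and duplicates are removed while keeping
    the original order.
    """
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"


urls = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/about",  # accidental duplicate
    "",
    "https://example.com/report.pdf",
]
print(text_sitemap(urls), end="")
```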
Still – I do like the canonical tag, but I have already seen sites implement it wrongly enough that it will cause problems for them (such as echoing back whatever parameters you put into the URL – doh!)
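A minimal Python sketch of the safer behaviour, assuming the simplest case where none of the query parameters are needed to identify the page: build the canonical URL from the scheme, host and path only, and drop whatever parameters the visitor arrived with. The function name is hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(requested_url):
    """Return a canonical <link> tag that ignores query string and fragment."""
    parts = urlsplit(requested_url)
    # Keep scheme, host and path; discard the query and fragment entirely.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(clean, quote=True)}" />'


print(canonical_link("https://example.com/page?utm_source=feed&sort=asc#top"))
```

If a site does use a parameter to pick the page (an id, say), that one parameter would need to be kept, ideally from an explicit list of allowed names rather than by copying the whole query string.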