New Standard: Search Engine Giants Adopt the XML Protocol
In 2005, Google launched the Sitemap 0.84 protocol, which
uses the XML format.
A sitemap is a way of organizing a website, identifying
the URLs and the data under each section. Previously,
sitemaps were geared primarily toward the users of a website.
Google's XML format, however, was designed for search
engines, allowing them to find the data faster and more
efficiently.
Google's new sitemap protocol was developed in response
to the increasing size and complexity of websites. Business
websites often contained hundreds of products in their
catalogues, while the popularity of blogging led
webmasters to update their material at least once a day, not
to mention popular community-building tools like forums and
message boards. As websites got bigger and bigger, it became
difficult for search engines to keep track of all this
material, and they sometimes "skipped" information as they
crawled through these rapidly changing pages.
Through the XML protocol, search engines could track the
URLs more efficiently, because all the information is placed
in one file. The XML file also indicates how frequently each
page on a website is updated, and records the last time any
changes were made.
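To make this concrete, here is a minimal sketch of how a program might read those two hints from a sitemap file. It uses Python's standard xml.etree.ElementTree module and the namespace from the sitemaps.org 0.9 schema (the successor to Google's 0.84). The sample content, the example.com URLs, and the function name read_sitemap are assumptions made up for illustration; this is not how any particular search engine does it.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org 0.9 schema.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-06-01</lastmod>
    <changefreq>daily</changefreq>
  </url>
  <url>
    <loc>http://www.example.com/catalogue/</loc>
    <changefreq>weekly</changefreq>
  </url>
</urlset>"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry; missing optional tags become None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", default=None, namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    for entry in read_sitemap(SAMPLE):
        print(entry)
```

Only the URL itself is required for each entry, so the sketch treats the update frequency and last-modified date as optional.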
XML sitemaps were not, as some people thought, a tool for
search engine optimization. A sitemap does not affect ranking,
but it does help search engines return more accurate and
up-to-date results. It does this by providing the data that a
search engine needs and putting it in one place, which is
quite handy given that there are millions of websites to
plough through.
The Sitemaps protocol allows a webmaster to inform
Google about URLs on a website that are available for crawling. A
Sitemap is an XML file that lists the URLs for a site. It allows
webmasters to include additional information about each URL: when it
was last updated, how often it changes, and how important it is in
relation to other URLs in the site. This allows search engines to
crawl the site more intelligently. Sitemaps are a URL inclusion
protocol and complement robots.txt, a URL exclusion protocol.
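As a rough illustration of what such a file contains, the sketch below builds a small Sitemap with Python's standard xml.etree.ElementTree module. The tag names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org 0.9 schema; the function name build_sitemap, the dictionary layout, and the example.com pages are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a <urlset> document from dicts with 'loc' and optional hint keys."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional hints: written only when the caller supplies them.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # ElementTree escapes characters such as "&" in URLs automatically.
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages, for illustration only.
    pages = [
        {"loc": "http://www.example.com/", "lastmod": "2005-06-01",
         "changefreq": "daily", "priority": 1.0},
        {"loc": "http://www.example.com/about.html"},
    ]
    print(build_sitemap(pages))
```

The three optional tags correspond to the three hints described above: when a URL was last updated, how often it changes, and how important it is relative to other URLs on the same site.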
Sitemaps are particularly beneficial on websites where:
- some areas of the website are not available through the
browsable interface, or
- webmasters use rich Ajax or Flash content that is not
normally processed by search engines.
The webmaster can generate a Sitemap containing all accessible
URLs on the site and submit it to search engines. Since Google, MSN,
Yahoo, and Ask now use the same protocol, a single Sitemap gives
the biggest search engines up-to-date information about the site's
pages.
Sitemaps supplement and do not replace the existing crawl-based
mechanisms that search engines already use to discover URLs. By
submitting Sitemaps to a search engine, a webmaster is only helping
that engine's crawlers to do a better job of crawling their site(s).
Using this protocol does not guarantee that web pages will be
included in search indexes, nor does it influence the way that pages
are ranked in search results.
To help Google and other search engines find the pages deep within
your site, we can create and submit a sitemap in the format
they require. This is not a simple list of web pages, but a file of
detailed code and information that is automatically updated
whenever a page is added or changed on your website.
Here is a brief summary of the sitemap generator's features:
- It generates any kind of sitemap you require: XML, text,
or HTML.
- It is developed in PHP and works with most
web server configurations
- Flexible configuration that allows you to set any sitemap
parameters and crawler settings
- Supports large websites by splitting the sitemap into parts
of up to 50,000 URLs each and generating a Sitemap Index file,
in accordance with the Google sitemap protocol.
- "robots.txt" exclusion protocol is supported
- GZip compression supported (optional)
- User-friendly progress indication for sitemap generation in
manual mode
- Lets you set up a cron job to create sitemaps without
user interaction
- Pings Google automatically when sitemap generation
is complete
- A website structure analysis feature shows, as a tree,
how your pages are distributed across folders
- The script collects details of all generated sitemaps and
provides a change log, including the number (and the
lists) of added and removed pages.
- Broken links are detected by the application and
reported on a special page, along with the URLs of the pages
that refer to these bad links.
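
To illustrate the large-site feature in the list above, here is a minimal sketch of how a URL list could be split into parts of at most 50,000 URLs, with a Sitemap Index file that points to each part and optional GZip compression. It is written in Python rather than the script's own PHP and is not the script's actual code. The file names (sitemap1.xml, sitemap_index.xml), the base_url parameter, and the function names are assumptions made for this example; the sitemapindex and sitemap tags come from the sitemaps.org schema.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = b'<?xml version="1.0" encoding="UTF-8"?>\n'
MAX_URLS = 50_000  # per-file limit mentioned in the feature list


def split_urls(urls, size=MAX_URLS):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def urlset_xml(urls):
    """Serialize one part as a <urlset> document (bytes)."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = loc
    return XML_HEADER + ET.tostring(root, encoding="utf-8")


def build_sitemaps(urls, base_url, compress=False):
    """Return {filename: bytes} for every part plus the index file."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for number, chunk in enumerate(split_urls(urls), start=1):
        name = f"sitemap{number}.xml"  # hypothetical naming scheme
        data = urlset_xml(chunk)
        if compress:
            name += ".gz"
            data = gzip.compress(data)
        files[name] = data
        # Each part is listed in the index by its public URL.
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = base_url + name
    files["sitemap_index.xml"] = XML_HEADER + ET.tostring(index, encoding="utf-8")
    return files


if __name__ == "__main__":
    urls = [f"http://www.example.com/page{i}.html" for i in range(120_001)]
    for name, data in build_sitemaps(urls, "http://www.example.com/").items():
        print(name, len(data), "bytes")
```

With 120,001 URLs the sketch produces three parts and one index file. A webmaster would then submit only the index file, which leads the search engine to each part.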
