Creating a Dynamic Sitemap Using ColdFusion

Published: 05-Mar-2011
Author: Steven Neiland
Site Url: http://www.neiland.net/article/creating-a-dynamic-sitemap-using-coldfusion/

An accurate sitemap is a necessity for SEO, and it also serves as an important usability feature for your website. However, building a sitemap manually can be a tedious job, and if you have a dynamic site (such as a blog) it is even harder to keep the sitemap accurate.

Fortunately, by using URL rewriting and ColdFusion we can create a dynamic sitemap that is built directly from our database each time it is requested. The same technique can also be applied to styling the sitemap using XSL. Here's how.

Before creating a sitemap dynamically, let's first create one manually to show the structure.

Step 1: Understanding the XML Sitemap Structure

A standard XML sitemap file consists of the following components.

  1. XML declaration: This is the standard declaration for an XML file.
  2. URLset: This serves as a wrapper for the url nodes.
  3. URL Nodes: A url node is a declaration for a specific page address on the site and consists of the following 4 child nodes.
    • loc: The actual web address of a particular page
    • lastmod: (optional) The date the page was last modified, in the format "yyyy-mm-dd".
    • changefreq: (optional) How often the page is updated: always (changes on each view), hourly, daily, weekly, monthly, yearly or never.
    • priority: (optional) The priority you assign this page relative to the rest of the site. A value in the range 0.0 - 1.0 (default 0.5).

As you can see, it really is not that complicated to create a sitemap. Here is a simple sitemap listing the home page and a blog page.

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url>
            <loc>http://www.neiland.net/</loc>
            <lastmod>2011-03-05</lastmod>
            <changefreq>daily</changefreq>
            <priority>1</priority>
        </url>
        <url>
            <loc>http://www.neiland.net/blog/</loc>
            <lastmod>2011-03-05</lastmod>
            <changefreq>weekly</changefreq>
            <priority>0.8</priority>
        </url>
    </urlset>
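
The rules for the optional child nodes can also be checked in code before a value is written to the sitemap. The following is a minimal sketch and not part of the original walkthrough: isValidSitemapEntry is a hypothetical helper name, and it assumes the sitemap protocol's limits of seven lowercase changefreq keywords and a priority between 0.0 and 1.0.

```cfml
<!--- Hypothetical helper: checks changefreq and priority against the sitemap protocol's limits --->
<cffunction name="isValidSitemapEntry" returntype="boolean" output="false">
    <cfargument name="changefreq" type="string" required="true">
    <cfargument name="priority" type="numeric" required="true">
    <cfset var allowedFreqs = "always,hourly,daily,weekly,monthly,yearly,never">

    <!--- listFind is case-sensitive, so only the lowercase keywords pass --->
    <cfif NOT listFind(allowedFreqs, arguments.changefreq)>
        <cfreturn false>
    </cfif>

    <!--- priority must fall between 0.0 and 1.0 inclusive --->
    <cfif arguments.priority LT 0 OR arguments.priority GT 1>
        <cfreturn false>
    </cfif>

    <cfreturn true>
</cffunction>

<cfoutput>
    <!--- passes both checks --->
    #isValidSitemapEntry("weekly", 0.8)#
    <!--- fails: "fortnightly" is not a protocol keyword --->
    #isValidSitemapEntry("fortnightly", 0.8)#
    <!--- fails: 1.1 is above the maximum priority --->
    #isValidSitemapEntry("daily", 1.1)#
</cfoutput>
```

A check like this is mostly useful once the values come from a database rather than being typed by hand.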

Step 2: Create a Sitemap.cfm Page

To build a dynamic sitemap that replicates the above XML file, we simply rename sitemap.xml to sitemap.cfm and add the following two directives to the head of the file. These tell ColdFusion to only output content that is wrapped in cfoutput tags and to use the UTF-8 encoding standard for the file. Finally, we wrap our previous code in cfoutput tags so that the contents of the file are output.

Note: It is very important that you leave no whitespace between the opening cfoutput tag and the XML declaration, as any whitespace before the XML declaration invalidates the file.

    <cfsetting enablecfoutputonly="true" showdebugoutput="false">
    <cfprocessingdirective pageencoding="utf-8">
    <cfoutput><?xml version="1.0" encoding="UTF-8"?>
    <urlset
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url>
            <loc>http://www.neiland.net/</loc>
            <lastmod>2011-03-05</lastmod>
            <changefreq>daily</changefreq>
            <priority>1</priority>
        </url>
        <url>
            <loc>http://www.neiland.net/blog/</loc>
            <lastmod>2011-03-05</lastmod>
            <changefreq>weekly</changefreq>
            <priority>0.8</priority>
        </url>
    </urlset>
    </cfoutput>

We test that this is working by visiting "http://oursite/sitemap.cfm/". If we get back an XML document then we are good to continue on to the next step.
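
The same check can be scripted rather than done by eye. The following is a minimal sketch, assuming a separate test page on the same server: the address is the placeholder used above, and sitemapResponse is a hypothetical variable name. It requests the sitemap with cfhttp and reports the status, the content type and whether the body parses as XML.

```cfml
<!--- Hypothetical test page; replace the placeholder address with your own site --->
<cfhttp url="http://oursite/sitemap.cfm" method="get" result="sitemapResponse">

<cfoutput>
    Status: #sitemapResponse.statusCode#<br>
    Content type: #sitemapResponse.mimeType#<br>
    <!--- isXML() reports whether the body can be parsed as XML;
          the body is deliberately not trimmed so stray leading whitespace is not hidden --->
    Well-formed XML: #isXML(sitemapResponse.fileContent)#
</cfoutput>
```

If the content type comes back as text/html rather than an XML type, one common remedy is to add a cfcontent tag such as <cfcontent type="text/xml; charset=utf-8"> near the top of sitemap.cfm. Whether that is needed depends on your server setup, so treat it as a suggestion to verify rather than a required step.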

Step 3: Add Dynamic Content

Now that the sitemap is being served by ColdFusion we can add to it dynamically. Here I will demonstrate how I add some blog entries to the listing.

    <cfsetting enablecfoutputonly="true" showdebugoutput="false">
    <cfprocessingdirective pageencoding="utf-8">
    <cfquery name="demo" datasource="mydatasource">
        select
            url, dateupdated
        from blog
        limit 5
    </cfquery>
    <cfoutput><?xml version="1.0" encoding="UTF-8"?>
    <urlset
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url>
            <loc>http://www.neiland.net/</loc>
            <lastmod>2011-03-05</lastmod>
            <changefreq>daily</changefreq>
            <priority>1</priority>
        </url>
        <url>
            <loc>http://www.neiland.net/blog/</loc>
            <lastmod>2011-03-05</lastmod>
            <changefreq>weekly</changefreq>
            <priority>0.8</priority>
        </url>
        <cfloop query="demo">
        <url>
            <loc>http://www.neiland.net/blog/#demo.url#/</loc>
            <lastmod>#dateFormat(demo.dateupdated,"yyyy-mm-dd")#</lastmod>
            <changefreq>never</changefreq>
            <priority>0.5</priority>
        </url>
        </cfloop>
    </urlset>
    </cfoutput>

As you can see, it's a simple matter of getting some data and outputting it. Note the use of dateFormat to output a valid date for the lastmod child node. Finally, there is one last step we can do.
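
One thing to watch when outputting database values into XML is special characters. The sitemap protocol requires characters such as the ampersand to be entity-escaped inside loc, and a raw ampersand makes the file invalid. The following is a minimal sketch of one way to handle that with ColdFusion's xmlFormat function. The in-memory query and the slug "r&d-notes" are hypothetical sample data standing in for the blog table.

```cfml
<cfsetting enablecfoutputonly="true" showdebugoutput="false">

<!--- Hypothetical sample data standing in for the blog query --->
<cfset demo = queryNew("url,dateupdated", "varchar,date")>
<cfset queryAddRow(demo)>
<cfset querySetCell(demo, "url", "r&d-notes")>
<cfset querySetCell(demo, "dateupdated", createDate(2011, 3, 5))>

<cfoutput>
<cfloop query="demo">
<url>
    <!--- xmlFormat escapes XML special characters, so the ampersand is written as &amp; --->
    <loc>http://www.neiland.net/blog/#xmlFormat(demo.url)#/</loc>
    <lastmod>#dateFormat(demo.dateupdated, "yyyy-mm-dd")#</lastmod>
</url>
</cfloop>
</cfoutput>
```

If your urls only ever contain letters, digits and hyphens the escaping changes nothing, so it is cheap insurance rather than a required change.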

Step 4: Create a Rewrite Rule

At this stage, if you go to http://yourdomain/sitemap.cfm you should see an XML file listing your url nodes. You could submit this to search engines as is, but I prefer to hide the fact that the file is dynamic by using this Apache rewrite rule. (IIS has a similar URL rewrite capability.)

    #Sitemap
    RewriteRule ^sitemap\.xml/?$ sitemap.cfm [L,NC]

So that's all there is to it. Now whenever Googlebot crawls your site it will find an accurate, up-to-date sitemap.xml file.