BJC Computer Services
 
 

Sitemap XML Creator
for Article Dashboard

 

Setup and Configuration
Options

 

 

Description:

The Sitemap XML Creator for Article Dashboard scans your local directory for files and can dynamically create URLs for Articles, Categories, and RSS Feeds to include in the sitemap.xml file. The script can also be configured to automatically create sitemap index files for sites that contain more than 50,000 URLs.

Note:  This script does NOT create a sitemap.html file


Installation:

  • Extract the ADSitemapCreator script file from the .zip archive
  • Open the script file in a text editor (such as Windows Notepad)
  • Set the required user options from the options list below
  • Save the updated script file
  • Upload the script file to your Article Dashboard root directory
  • Execute the script via a web browser or a cron job

User Options

The following user options are available in the Sitemap XML Creator for Article Dashboard. Each user configuration option is explained below.

Database access parameters are read from the Article Dashboard 'setup.php' file in the root directory. Before proceeding, verify that the file exists and that it contains accurate database access settings. The script will not execute without the Article Dashboard setup.php file.
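
A quick pre-flight check can confirm that the file is in place before the first run. This is a minimal sketch and not part of the ADSitemapCreator script; the function name setup_file_is_usable is hypothetical, and the check covers only existence and readability, not the database settings inside the file.

```php
<?php
// Hypothetical helper (not part of the script): reports whether an
// Article Dashboard setup.php file exists and is readable in $root_dir.
function setup_file_is_usable(string $root_dir): bool
{
    $setup_file = rtrim($root_dir, '/') . '/setup.php';
    return is_file($setup_file) && is_readable($setup_file);
}

// Assumes this check is saved in the Article Dashboard root directory.
if (!setup_file_is_usable(__DIR__)) {
    echo "setup.php was not found or is not readable in " . __DIR__ . "\n";
}
```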

** Note:  Default script user options will work fine for most users **

To DISABLE any option - Set value = false;  To ENABLE - Set value = true;


Option - URL Page Limit

This option limits the number of URL entries per sitemap file (for large sites that exceed 50,000 total URLs) and helps keep each individual sitemap file smaller than 10 MB.

Using this option as a reference, the script splits large sites (more than 50,000 URLs) into multiple sitemap files and also creates a sitemap index file. This option cannot be set to a value greater than the maximum allowable 50,000 URLs. BJC recommends the default value of 40000.

  • Example:   $URL_Page_Limit = 40000;
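
The effect of the page limit can be sketched as a simple chunking step. This is an illustration only, not the script's actual code: the function name split_urls_into_files and the placeholder URLs are hypothetical, and it assumes the split happens at the configured page limit.

```php
<?php
$URL_Page_Limit = 40000;   // option value described above

// Hypothetical helper: splits a flat URL list into one chunk per sitemap file.
function split_urls_into_files(array $urls, int $page_limit): array
{
    // Keep the limit between 1 and the protocol maximum of 50,000 URLs per file.
    $page_limit = max(1, min($page_limit, 50000));
    return array_chunk($urls, $page_limit);
}

// Build 90,000 placeholder URLs to stand in for a large site.
$urls = [];
for ($i = 1; $i <= 90000; $i++) {
    $urls[] = "http://www.yourdomain.com/page-" . $i . ".php";
}

$files       = split_urls_into_files($urls, $URL_Page_Limit);
$needs_index = count($files) > 1;   // more than one file calls for an index

echo count($files) . " sitemap file(s); index needed: "
   . ($needs_index ? "yes" : "no") . "\n";
```

With these placeholder numbers the list is split into three files of 40,000, 40,000, and 10,000 URLs.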

Option - URL Limiter

When set to TRUE, this option limits the total number of URLs written to the sitemap XML file.

Note: Enabling this option is useful for testing the sitemap XML creator script without generating a full sitemap.

  • Enable:   $URL_Limiter = true;
  • Disable:  $URL_Limiter = false; 

Option - URL Limiter Value

When the URL Limiter is enabled, this option sets the maximum number of URLs listed. Note that URLs are processed in a fixed order (for whichever options are enabled): local file URLs first, followed by Articles, then Categories, then RSS Feeds.

  • Example:   $URL_Limiter_Value = 1000;
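
Because of the processing order, a small limiter value keeps local files first and may cut off later groups entirely. The sketch below illustrates that behaviour under stated assumptions; the function name apply_url_limiter and the sample paths are hypothetical and are not taken from the script.

```php
<?php
$URL_Limiter       = true;
$URL_Limiter_Value = 5;

// Hypothetical helper: joins the four URL groups in the documented
// processing order, then trims the result when the limiter is enabled.
function apply_url_limiter(
    array $local,
    array $articles,
    array $categories,
    array $feeds,
    bool $enabled,
    int $limit
): array {
    $all = array_merge($local, $articles, $categories, $feeds);
    return $enabled ? array_slice($all, 0, max(0, $limit)) : $all;
}

$limited = apply_url_limiter(
    ["/index.php", "/about.php", "/contact.php"],   // local files
    ["/article-1", "/article-2", "/article-3"],     // articles
    ["/category-1"],                                // categories
    ["/rss-1"],                                     // RSS feeds
    $URL_Limiter,
    $URL_Limiter_Value
);

print_r($limited);
```

With a limit of 5, the result holds the three local files and the first two articles; the category and RSS feed entries are dropped.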

Option - Sitemap Index Name

Sitemap XML files are limited to 50,000 URLs and a file size of less than 10 MB. When either limit is exceeded, a sitemap index is required. This option sets the file name to use when a sitemap index file is required. You can change this name to whatever you like.

  • Example:  $Sitemap_Index_Name = "sitemap-index.xml";

Option - Sitemap File Name 

When a sitemap index file is not required, this is the file name used to create the individual sitemap XML file.

  • Example:   $Sitemap_File_Name = "sitemap.xml";
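
For reference, a sitemap index is a small XML file that lists the individual sitemap files. The sketch below shows one way such a file could be assembled; it is not the script's own code. The function name build_sitemap_index and the numbered file names (sitemap1.xml, sitemap2.xml) are assumptions, since this page does not say how the script names the split files.

```php
<?php
$Sitemap_Index_Name = "sitemap-index.xml";

// Hypothetical helper: returns the XML text of a sitemap index that
// points at each of the given sitemap file names under $base_url.
function build_sitemap_index(string $base_url, array $file_names): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($file_names as $name) {
        // Escape characters such as & so the XML stays well-formed.
        $loc  = htmlspecialchars(rtrim($base_url, '/') . '/' . $name, ENT_QUOTES, 'UTF-8');
        $xml .= "  <sitemap><loc>" . $loc . "</loc></sitemap>\n";
    }
    $xml .= "</sitemapindex>\n";
    return $xml;
}

// Assumed file names for the split sitemap files.
$index_xml = build_sitemap_index(
    "http://www.yourdomain.com",
    ["sitemap1.xml", "sitemap2.xml"]
);

echo "Contents for " . $Sitemap_Index_Name . ":\n" . $index_xml;
```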

Option - GZ Compress

Users can enable GZ compression for sitemap files. GZ compression reduces the file size and conserves web site bandwidth when search engines download large sitemap files.

  • Enable:    $gz_compress = TRUE;
  • Disable:   $gz_compress = FALSE;

Option - GZ Delete Original

When the GZ compress option is Enabled for sitemap files, users also have the option to delete the original XML files, or keep them in place.

  • Enable:    $gz_delete_original = TRUE;
  • Disable:   $gz_delete_original = FALSE;
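
The two GZ options can be pictured as a compress step followed by an optional delete. This is a sketch under stated assumptions, not the script's implementation: the function name compress_sitemap and the ".gz" suffix are hypothetical, and the code relies on PHP's zlib extension for gzencode().

```php
<?php
$gz_compress        = TRUE;
$gz_delete_original = FALSE;

// Hypothetical helper: writes a gzip copy of $xml_path next to it and
// optionally removes the original. Returns the .gz path, or null on failure.
function compress_sitemap(string $xml_path, bool $delete_original): ?string
{
    if (!is_file($xml_path)) {
        return null;
    }
    $data = file_get_contents($xml_path);
    if ($data === false) {
        return null;
    }
    $gz_path = $xml_path . '.gz';
    if (file_put_contents($gz_path, gzencode($data, 9)) === false) {
        return null;
    }
    if ($delete_original) {
        unlink($xml_path);   // keep only the compressed copy
    }
    return $gz_path;
}

if ($gz_compress) {
    $result = compress_sitemap(__DIR__ . '/sitemap.xml', $gz_delete_original);
    echo $result === null ? "Nothing compressed\n" : "Wrote " . $result . "\n";
}
```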

Option - Include Local Files

When enabled, the Sitemap XML Creator script scans your local root directory and includes the files it finds in the sitemap. The script does not include files that are "Disallowed" in your robots.txt file.

Note:  This script DOES NOT crawl sub-directories.

  • Enable:    $include_local_files = TRUE;
  • Disable:   $include_local_files = FALSE;
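
The combination of a root-only scan and a robots.txt check can be sketched as follows. This is a simplified illustration, not the script's actual logic: all three function names are hypothetical, the matching is a plain prefix comparison, and User-agent groups and wildcards in robots.txt are ignored.

```php
<?php
// Hypothetical helper: collects the path of every "Disallow:" rule.
// An empty "Disallow:" line (which blocks nothing) is skipped.
function read_disallowed_paths(string $robots_txt): array
{
    $paths = [];
    foreach (preg_split('/\r\n|\r|\n/', $robots_txt) as $line) {
        if (preg_match('/^\s*Disallow:\s*(\S+)/i', $line, $m)) {
            $paths[] = $m[1];
        }
    }
    return $paths;
}

// Hypothetical helper: lists files in $root_dir only, as URL paths.
// Sub-directories are not entered, matching the note above.
function list_root_files(string $root_dir): array
{
    $paths = [];
    foreach (scandir($root_dir) as $entry) {
        if (is_file($root_dir . '/' . $entry)) {
            $paths[] = '/' . $entry;
        }
    }
    return $paths;
}

// Hypothetical helper: drops every path that starts with a disallowed prefix.
function filter_disallowed(array $url_paths, array $disallowed): array
{
    $kept = array_filter($url_paths, function ($path) use ($disallowed) {
        foreach ($disallowed as $prefix) {
            if (strpos($path, $prefix) === 0) {
                return false;
            }
        }
        return true;
    });
    return array_values($kept);
}

$robots     = "User-agent: *\nDisallow: /admin.php\nDisallow:\n";
$disallowed = read_disallowed_paths($robots);
print_r(filter_disallowed(["/index.php", "/admin.php"], $disallowed));
```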

Option - Include Articles 

When enabled, the script scans your Article Dashboard database and creates sitemap URL entries for all approved articles. When disabled, the script skips the approved article list.

  • Enable:   $include_articles = TRUE;
  • Disable:  $include_articles = FALSE;

Option - Include Categories

When enabled, the script scans your Article Dashboard database and creates sitemap URL entries for all categories. When disabled, the script skips the category list.

  • Enable:   $include_categories = TRUE;
  • Disable:  $include_categories = FALSE;

Option - Include RSS Feeds

When enabled, the script scans your Article Dashboard database and creates sitemap URL entries for all RSS Feeds. When disabled, the script skips the RSS Feed list.

  • Enable:   $include_rss_feeds = TRUE;
  • Disable:  $include_rss_feeds = FALSE;

Option - Priority 

The Priority options set the URL priority reported to search engines. Assign reasonable priorities from 0.0 (lowest) to 1.0 (highest) to your pages. Setting these options sensibly helps search engines identify the highest priority URLs for indexing.

Note: It is NOT recommended to set ALL URLs to Priority 1.0.

Examples:

  • $localfile_priority = "1.0";
  • $article_priority = "0.5";
  • $category_priority = "0.5";
  • $rssfeed_priority = "0.5";

Option - Change Frequency 

The Change Frequency options indicate how often the URL is expected to change. The value is an estimate that serves as a hint to the search engine crawler, not a command. For example, use "always" if the URL content changes on each page access. Use "never" for archived pages that do not change.

Examples:

  • $localfile_changefreq = "daily";
  • $article_changefreq = "daily";
  • $category_changefreq = "daily";
  • $rssfeed_changefreq = "daily";
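
To show where the priority and change frequency values end up, here is a sketch of a single sitemap entry being built. It is an illustration only, not the script's own code: the function name build_url_entry and the sample article URL are hypothetical, and lowercasing the change frequency before validating it is an assumption.

```php
<?php
$article_priority   = "0.5";
$article_changefreq = "daily";

// Hypothetical helper: returns one <url> entry for a sitemap file and
// rejects values outside the ranges described in the options above.
function build_url_entry(string $loc, string $priority, string $changefreq): string
{
    $allowed    = ['always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'];
    $changefreq = strtolower($changefreq);
    if (!in_array($changefreq, $allowed, true)) {
        throw new InvalidArgumentException("Unknown changefreq: " . $changefreq);
    }
    if (!is_numeric($priority) || (float) $priority < 0.0 || (float) $priority > 1.0) {
        throw new InvalidArgumentException("Priority must be between 0.0 and 1.0");
    }
    // Escape characters such as & so the XML stays well-formed.
    $loc = htmlspecialchars($loc, ENT_QUOTES, 'UTF-8');

    return "  <url>\n"
         . "    <loc>" . $loc . "</loc>\n"
         . "    <changefreq>" . $changefreq . "</changefreq>\n"
         . "    <priority>" . number_format((float) $priority, 1) . "</priority>\n"
         . "  </url>\n";
}

$entry = build_url_entry(
    "http://www.yourdomain.com/article.php?id=1&cat=2",   // placeholder URL
    $article_priority,
    $article_changefreq
);
echo $entry;
```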

Possible values:
"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"


Option - Ping Google

When enabled, this option automatically notifies Google (via ping) each time the sitemap file has been updated.

  • Enable:     $ping_google = TRUE;
  • Disable:    $ping_google = FALSE;

Option - Local File Includes

Files located in the root directory are imported automatically when the Include Local Files option is enabled.

This option (Local File Includes) lets you add other local files to the sitemap, such as files located in sub-directories.

Sample Array:

          $local_file_includes = array (
               "/path/to/include/file_name1.php",
               "/path/to/include/file_name2.php",
               "/path/to/include/file_name3.php"
                );

Note that each path is relative to the site's root URL; it is not the server file path:
Example: http://www.yourdomain.com/path/to/include/file_name1.php


Option - Local File Excludes

This option array is provided to EXCLUDE local files from the sitemap. The script respects your local robots.txt and automatically excludes any files covered by a "Disallow" rule in the robots.txt file.

Use this array to exclude files which ARE NOT already excluded by robots.txt.

Sample Array:

          $local_file_excludes = array (
               "/path/to/exclude/file_name1.php",
               "/path/to/exclude/file_name2.php",
               "/path/to/exclude/file_name3.php"
                );

Note that each path is relative to the site's root URL; it is not the server file path:
Example: http://www.yourdomain.com/path/to/exclude/file_name1.php
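
Taken together, the include and exclude arrays act as an add step followed by a remove step on the list of local files. The sketch below illustrates that idea; it is not the script's actual code. The function name apply_includes_and_excludes, the sample root files, and the "/private.php" entry are hypothetical.

```php
<?php
$base_url = "http://www.yourdomain.com";

// Files found by scanning the root directory (placeholder values).
$root_files = ["/index.php", "/private.php"];

$local_file_includes = array(
    "/path/to/include/file_name1.php"
);
$local_file_excludes = array(
    "/private.php"
);

// Hypothetical helper: adds the included paths, removes duplicates,
// then removes every excluded path.
function apply_includes_and_excludes(array $root_files, array $includes, array $excludes): array
{
    $merged = array_unique(array_merge($root_files, $includes));
    return array_values(array_diff($merged, $excludes));
}

$paths = apply_includes_and_excludes($root_files, $local_file_includes, $local_file_excludes);

// Paths are relative to the site URL, so the full URL is base + path.
foreach ($paths as $path) {
    echo $base_url . $path . "\n";
}
```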


 


Disclaimer: Test this script thoroughly under your unique server configuration before using in a live traffic environment. For questions, assistance, or suggestions: Email: support@bjc-computer-services.com

Important:  For script security, add the following entry to your robots.txt file to exclude the script file from search engine indexing. Replace YourScriptName.php with the name of the script file you uploaded.

     User-agent: *
     Disallow: /YourScriptName.php


 
