General

The General screen provides the following settings:

Suppress Login Dialog

Whenever SiteSucker encounters a page that requires authentication, it displays the Login Dialog.

Switch this on to suppress the Login Dialog and skip the download of any pages that require authentication. For more information on authentication, see Password-protected Sites.

Ignore Robot Exclusions

Switch this on to have SiteSucker ignore robots.txt exclusions and the Robots META tag.

Warning: Ignoring robot exclusions is not recommended. Robot exclusions are usually put in place for a good reason and should be obeyed.

By default, SiteSucker honors robots.txt exclusions and the Robots META tag. The robots.txt file allows the Web site administrator to define what parts of a site are off-limits to specific robots, like SiteSucker. Web administrators can disallow access to cgi, private, and temporary directories, for example, because they do not want pages in those areas downloaded. In addition to server-wide robot control using robots.txt, Web page creators can also use the Robots META tag to specify that the links on a page should not be followed by robots.
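As an illustration, a robots.txt file like the following (with hypothetical directory names) tells all robots to stay out of two directories, while a Robots META tag in an individual page asks robots not to follow that page's links:

```
# robots.txt at the root of the site
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
```

The per-page equivalent is an HTML tag in the page's head section, such as `<meta name="robots" content="nofollow">`. When robot exclusions are honored, SiteSucker skips any file matching a Disallow rule and does not follow links on pages carrying a nofollow directive.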

Replace Files

Use this control to specify when SiteSucker should replace existing files. You can choose from the following options:

  • Never - SiteSucker never replaces your local files and only downloads those files that haven't already been downloaded.
  • Always - SiteSucker always deletes your local files and replaces them with files downloaded from the Internet.
  • With Newer - SiteSucker only replaces existing files if a newer copy is found on the Internet.

Note: SiteSucker will always replace existing HTML and CSS files regardless of the Replace Files setting.
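The decision rules above can be sketched in pseudocode-style Python. The function name, option labels, and parameters are hypothetical stand-ins for the Replace Files choices, not SiteSucker's actual internals:

```python
def should_replace(option, local_exists, remote_is_newer, is_html_or_css):
    """Sketch of the Replace Files decision described above.

    option is one of "never", "always", or "with_newer" (hypothetical
    labels for the three menu choices).
    """
    if not local_exists:
        return True   # file hasn't been downloaded yet, so download it
    if is_html_or_css:
        return True   # HTML and CSS files are always replaced
    if option == "never":
        return False  # keep the existing local file
    if option == "always":
        return True   # delete and re-download the local file
    if option == "with_newer":
        return remote_is_newer  # replace only if the remote copy is newer
    raise ValueError(f"unknown option: {option}")
```

For example, with the With Newer option a previously downloaded image is replaced only when the server reports a newer copy, but an HTML page is re-downloaded regardless.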

Path Constraint

Use this control to limit downloaded files to those at a specific site, within a specific directory, or containing a specific path. This option works in conjunction with the Paths settings. SiteSucker provides the following path constraints:

  • None - SiteSucker downloads the file specified in the Web URL text box, every file it links to, every site those files link to, and so on. Be aware that this option could result in a HUGE download if allowed to continue indefinitely.
  • Subdomains - SiteSucker limits the download to those files within the second-level domain and all subdomains of the original file being downloaded.
  • Host - SiteSucker limits the download to those files on the host of the original file being downloaded.
  • Directory - SiteSucker only downloads those files that are within the directory of the original file being downloaded.
  • Paths Settings - SiteSucker downloads the file specified in the Web URL text box and any files referenced which have paths included in the Paths settings.
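The first four constraints above can be sketched as a URL filter. This is an illustrative approximation, not SiteSucker's actual logic; the constraint names are hypothetical labels, and the Paths Settings option is omitted because it depends on user-entered paths:

```python
from urllib.parse import urlparse

def within_constraint(start_url, candidate_url, constraint):
    """Return True if candidate_url passes the given path constraint
    relative to start_url (the file named in the Web URL text box)."""
    start, cand = urlparse(start_url), urlparse(candidate_url)
    if constraint == "none":
        return True  # follow every link, anywhere
    if constraint == "host":
        return cand.hostname == start.hostname
    if constraint == "subdomains":
        # Keep the second-level domain (e.g. "example.com") and anything under it.
        base = ".".join(start.hostname.split(".")[-2:])
        return cand.hostname == base or cand.hostname.endswith("." + base)
    if constraint == "directory":
        # Same host, and within the starting file's directory.
        start_dir = start.path.rsplit("/", 1)[0] + "/"
        return cand.hostname == start.hostname and cand.path.startswith(start_dir)
    raise ValueError(f"unknown constraint: {constraint}")
```

For instance, starting from http://www.example.com/docs/index.html, the Subdomains constraint would allow http://blog.example.com/post but not http://example.org/post, while the Directory constraint would allow only files under /docs/ on www.example.com.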