General

The General settings offer options that tell SiteSucker what to ignore, what to download, whether existing files should be replaced, and more.

The General screen provides the following settings:

Ignore Robot Exclusions

Switch on this control to have SiteSucker ignore robots.txt directives, the Robots META tag, and the X-Robots-Tag HTTP header. See Robot Exclusions for more information about each of these mechanisms.

Note: SiteSucker always honors robots.txt directives aimed specifically at SiteSucker.

Warning: Ignoring robot exclusions is not recommended. Robot exclusions are usually put in place for a good reason and should be obeyed.
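
The following Python sketch illustrates how the switch and the note above interact for robots.txt. It is not SiteSucker's actual code: the function names, the "SiteSucker" user-agent token, and the way a targeted group is detected are assumptions made for illustration. It covers only robots.txt, not the Robots META tag or the X-Robots-Tag header.

```python
from urllib.robotparser import RobotFileParser

AGENT = "SiteSucker"  # user-agent token assumed for illustration


def has_specific_group(robots_lines, agent=AGENT):
    """Return True if any User-agent line names the agent explicitly."""
    for line in robots_lines:
        field, _, value = line.partition(":")
        if field.strip().lower() == "user-agent" and value.strip().lower() == agent.lower():
            return True
    return False


def may_download(robots_lines, url, ignore_exclusions=False, agent=AGENT):
    """Decide whether url may be fetched under the given robots.txt lines."""
    # With the switch on, generic rules are skipped, but rules aimed
    # specifically at the agent are still honored.
    if ignore_exclusions and not has_specific_group(robots_lines, agent):
        return True
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)
```

For example, with a robots.txt that contains only "User-agent: *" and "Disallow: /private/", may_download blocks /private/ pages by default and allows them when ignore_exclusions is True. If the file also contains a "User-agent: SiteSucker" group, that group's rules apply either way.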

Ignore rel="nofollow"

Switch on this control to have SiteSucker ignore the rel="nofollow" attribute. When the rel attribute of a link is set to "nofollow", a robot like SiteSucker should not follow that link. By default, SiteSucker does not download nofollow links. When this switch is on, SiteSucker downloads links that have the rel="nofollow" attribute.
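
As an illustration of the behavior described above, here is a minimal Python sketch that collects links from a page and skips nofollow links unless told to ignore the attribute. The names LinkCollector, collect_links, and ignore_nofollow are hypothetical, and the sketch does not reflect how SiteSucker itself scans pages. It treats rel as a space-separated list of tokens, so rel="nofollow noopener" also counts as nofollow.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, optionally skipping nofollow links."""

    def __init__(self, ignore_nofollow=False):
        super().__init__()
        self.ignore_nofollow = ignore_nofollow
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel can hold several tokens; a missing rel yields an empty list
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens and not self.ignore_nofollow:
            return  # default behavior: do not follow this link
        self.links.append(href)


def collect_links(html, ignore_nofollow=False):
    """Return the links that would be followed in the given HTML text."""
    collector = LinkCollector(ignore_nofollow)
    collector.feed(html)
    return collector.links
```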

Include Supporting Files

Switch on this control to have SiteSucker include all supporting files in the download. When this option is on, SiteSucker downloads non-HTML files (such as style sheets and images) even if they are not allowed by the current URL settings or they exceed the Maximum Number of Levels set under the Limit settings. This setting is useful when downloading sites that link to style sheets, images, or other supporting files on separate hosts or subdomains.
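
The decision this setting influences can be pictured with the short Python sketch below. It is a simplified model, not SiteSucker's implementation: the function name, its parameters, and the reduction of the URL settings to a set of allowed hosts are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def should_download(url, is_html, level, allowed_hosts, max_levels,
                    include_supporting_files):
    """Return True if a file would be downloaded under this simplified model."""
    host_allowed = urlparse(url).hostname in allowed_hosts
    within_limits = host_allowed and level <= max_levels
    if within_limits:
        return True
    # Outside the limits, only non-HTML supporting files get through,
    # and only when the setting is on.
    return include_supporting_files and not is_html
```

For example, a style sheet hosted at cdn.example.net would be rejected when only example.com is allowed, but accepted once include_supporting_files is True. An HTML page on the same outside host is rejected in both cases.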

Always Download HTML and CSS

Switch on this control to have SiteSucker always download HTML and CSS files regardless of the File Replacement setting. Use this control to force SiteSucker to download fresh copies of HTML and CSS files.

Download Error Pages

Switch on this control to have SiteSucker download an error page, if available, when an error occurs while downloading a file. If a downloaded error page already exists, SiteSucker will always try to download the file again despite the File Replacement setting. Error pages are never scanned for links to other files. If this control is off, nothing is downloaded when an error occurs.

Connections

Use this control to set the number of simultaneous Internet connections to use when downloading sites.

Login Dialog

Use this control to specify when SiteSucker should display the login dialog for basic HTTP authentication. For more information on authentication and the login dialog, see Password-protected Sites. You can choose from the following options:

  • Never Display - SiteSucker never displays the login dialog. If valid login credentials were recently entered or were found in the Keychain, SiteSucker will use them; otherwise, files that require authentication will be skipped. This option also suppresses display of the Certificate Trust Panel, which is shown when there is a problem with a server’s certificate. If the certificate for a server is invalid and this option is selected, SiteSucker will not display the panel and will not download content from that server.
  • Always Display - SiteSucker always displays the login dialog.
  • Display When Necessary - SiteSucker displays the login dialog unless valid login credentials were recently entered or a single relevant Keychain item was found.
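
The three options can be summarized as a small decision function. This Python sketch only restates the rules listed above. The option strings and parameter names are made up for illustration, and it does not model the Certificate Trust Panel.

```python
def should_show_login(option, recent_credentials_valid, keychain_matches):
    """Return True if the login dialog would be displayed.

    option: "never", "always", or "when_necessary"
    recent_credentials_valid: valid credentials were recently entered
    keychain_matches: number of relevant Keychain items found
    """
    if option == "never":
        return False
    if option == "always":
        return True
    if option == "when_necessary":
        # Skip the dialog only when usable credentials are unambiguous
        return not (recent_credentials_valid or keychain_matches == 1)
    raise ValueError(f"unknown option: {option!r}")
```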

File Replacement

Use this control to specify when SiteSucker should replace existing files. You can choose from the following options:

  • Never - SiteSucker never replaces your local files and only downloads those files that haven’t already been downloaded.
  • Always - SiteSucker always deletes your local files and replaces them with files downloaded from the Internet.
  • With Newer - SiteSucker replaces an existing file only if a newer copy is found on the Internet.
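
The Python sketch below shows one way to model the three File Replacement options. It is an assumption-laden illustration, not SiteSucker's implementation. In particular, it assumes "newer" is judged by comparing the server's Last-Modified header with the local file's modification time, and the force parameter stands in for overrides such as Always Download HTML and CSS.

```python
from email.utils import parsedate_to_datetime


def should_replace(policy, local_mtime, last_modified_header, force=False):
    """Return True if the file should be downloaded.

    policy: "never", "always", or "with_newer"
    local_mtime: timezone-aware datetime of the local copy, or None if absent
    last_modified_header: value of the Last-Modified header, or None
    """
    if local_mtime is None or force:
        return True  # nothing to replace yet, or an override applies
    if policy == "always":
        return True
    if policy == "never":
        return False
    if policy == "with_newer":
        if last_modified_header is None:
            return False  # assumption: keep the local copy when no date is sent
        return parsedate_to_datetime(last_modified_header) > local_mtime
    raise ValueError(f"unknown policy: {policy!r}")
```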