The General pane of the Settings dialog provides the following controls:
Ignore Robot Exclusions
Check this box to have SiteSucker ignore robots.txt directives, the Robots META tag, and the X-Robots-Tag HTTP header. See Robot Exclusions for more information about these three mechanisms.
Note: SiteSucker always honors robots.txt directives aimed specifically at SiteSucker.
Warning: Ignoring robot exclusions is not recommended. Robot exclusions are usually put in place for a good reason and should be obeyed.
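The interaction between this option and the note above can be sketched in Python. This is only an illustrative model, not SiteSucker's implementation: the names may_fetch and has_group_for are hypothetical, the sketch covers robots.txt only (not the Robots META tag or the X-Robots-Tag header), and it assumes that "aimed specifically at SiteSucker" means a User-agent line naming SiteSucker.

```python
from urllib.robotparser import RobotFileParser

# Two sample robots.txt files (made up for this sketch).
GENERIC = "User-agent: *\nDisallow: /private/\n"
TARGETED = "User-agent: SiteSucker\nDisallow: /private/\n"


def has_group_for(robots_txt, agent):
    """Return True if robots.txt has a User-agent line naming this agent."""
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            if value.strip().lower() == agent.lower():
                return True
    return False


def may_fetch(url, robots_txt, ignore_robot_exclusions, agent="SiteSucker"):
    """Hypothetical model of the Ignore Robot Exclusions option."""
    # Assumption: with the option on, generic rules are skipped, but rules
    # that name the agent explicitly are still honored.
    if ignore_robot_exclusions and not has_group_for(robots_txt, agent):
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Under these assumptions, a URL under /private/ is blocked by GENERIC only while the option is off, and is blocked by TARGETED regardless of the option.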
Check this box to have SiteSucker ignore the rel="nofollow" attribute. If the rel attribute equals "nofollow" in an HTML tag, then a robot like SiteSucker should not follow that link. By default, SiteSucker will not download nofollow links. However, if this box is checked, SiteSucker will download links that have the rel="nofollow" attribute.
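As a rough illustration of the nofollow behavior described above, the following Python sketch collects links from anchor tags and skips rel="nofollow" links unless told to ignore the attribute. The class LinkCollector and the function collect_links are hypothetical names, and the sketch treats rel as a space-separated token list, which is an assumption about how the attribute is matched.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, optionally honoring rel="nofollow"."""

    def __init__(self, ignore_nofollow=False):
        super().__init__()
        self.ignore_nofollow = ignore_nofollow
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if tag != "a" or not href:
            return
        # Assumption: rel may hold several tokens, e.g. rel="nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens and not self.ignore_nofollow:
            return  # default behavior: do not follow the link
        self.links.append(href)


def collect_links(html, ignore_nofollow=False):
    collector = LinkCollector(ignore_nofollow)
    collector.feed(html)
    return collector.links


page = '<a href="a.html">A</a> <a rel="nofollow" href="b.html">B</a>'
print(collect_links(page))                        # nofollow link skipped
print(collect_links(page, ignore_nofollow=True))  # nofollow link included
```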
Ignore Filename in Headers
Check this box to have SiteSucker ignore the filename directive in all HTTP Content-Disposition headers. See File Names for more information about how SiteSucker names downloaded files.
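To make the effect of this option concrete, here is a minimal Python sketch of choosing between the filename in a Content-Disposition header and the last segment of the URL path. The function choose_name and the "index.html" fallback are assumptions made for illustration; the naming rules SiteSucker actually applies are described under File Names and are not reproduced here.

```python
import posixpath
from email.message import Message
from urllib.parse import urlparse


def choose_name(url, content_disposition, ignore_header_filename):
    """Hypothetical helper: pick a file name for a downloaded resource."""
    if content_disposition and not ignore_header_filename:
        # Let the standard library parse the header parameters.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        filename = msg.get_filename()
        if filename:
            # Keep only the final component in case the header holds a path.
            return posixpath.basename(filename)
    # Fall back to the last URL path segment (fallback name is an assumption).
    return posixpath.basename(urlparse(url).path) or "index.html"


header = 'attachment; filename="report.pdf"'
print(choose_name("http://www.example.com/download?id=7", header, False))
print(choose_name("http://www.example.com/download?id=7", header, True))
```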
Treat Ambiguous URLs as Folders
Check this box to have SiteSucker treat ambiguous URLs as folders. SiteSucker considers a URL ambiguous if it does not end with a '/' or a file extension. For example, if this option is on and SiteSucker downloads a webpage from http://www.example.com/directory, the webpage will be saved at www.example.com/directory/index.html in the destination folder. If this option is off, the webpage will be saved at www.example.com/directory.html in the destination folder. See File Names for more information about how SiteSucker names downloaded files.
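The mapping in the example above can be sketched as a small Python function. The name local_path is hypothetical, and the handling of URLs that already end with '/' (saving them as index.html) is an assumption; only the two ambiguous-URL outcomes come from the description above.

```python
import posixpath
from urllib.parse import urlparse


def local_path(url, treat_ambiguous_as_folders):
    """Hypothetical helper: map a webpage URL to a path in the destination folder."""
    parts = urlparse(url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: an explicit folder URL is saved as its index page.
        path += "index.html"
    elif not posixpath.splitext(path)[1]:
        # Ambiguous URL: no trailing '/' and no file extension.
        if treat_ambiguous_as_folders:
            path += "/index.html"
        else:
            path += ".html"
    return parts.netloc + path


print(local_path("http://www.example.com/directory", True))
print(local_path("http://www.example.com/directory", False))
```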
Always Download HTML and CSS
Check this box to have SiteSucker always download HTML and CSS files despite the File Replacement setting. Use this control to force SiteSucker to download fresh copies of HTML and CSS files.
Download Error Pages
Check this box to have SiteSucker download an error page, if available, when an error occurs while downloading a file. If a downloaded error page already exists, SiteSucker will always try to download the file again despite the File Replacement setting. Error pages are never scanned for links to other files.
If this box is not checked, nothing is downloaded when an error occurs.
Ask For Destination
Check this box to have SiteSucker ask you to choose the destination folder when you start a download.
Use this control to specify when SiteSucker should display the login dialog for basic HTTP authentication. For more information on authentication and the login dialog, see Password-protected Sites. You can choose from the following options:
- Never Display - SiteSucker never displays the login dialog. If valid login credentials were recently entered or were found in the Keychain, SiteSucker will use them; otherwise, files that require authentication will be skipped. This option also suppresses display of the Certificate Trust Panel, which is shown when there is a problem with a server's certificate. If the certificate for a server is invalid and this option is selected, SiteSucker will not display the panel and will not download content from that server.
- Always Display - SiteSucker always displays the login dialog.
- Display When Necessary - SiteSucker displays the login dialog unless valid login credentials were recently entered or a single relevant Keychain item was found.
File Replacement
Use this control to specify when SiteSucker should replace existing files. You can choose from the following options:
- Never - SiteSucker never replaces your local files and only downloads those files that haven't already been downloaded.
- Always - SiteSucker always deletes your local files and replaces them with files downloaded from the Internet.
- With Newer - SiteSucker only replaces existing files if a newer copy is found on the Internet.
Note: In some cases, existing HTML and CSS files may be replaced even when the Never setting is used. This may happen, for example, when File Modification is set to None but the existing file has been localized.
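The three replacement options can be read as a simple decision rule, sketched below in Python. The enum Replacement and the function should_download are hypothetical names. The sketch assumes that "newer" is decided by comparing modification times and that an unknown remote time means the local file is kept, and it does not model the exceptions mentioned in the note above.

```python
from enum import Enum


class Replacement(Enum):
    NEVER = "Never"
    ALWAYS = "Always"
    WITH_NEWER = "With Newer"


def should_download(policy, local_mtime, remote_mtime):
    """Hypothetical helper: decide whether to fetch a file.

    local_mtime is None when no local copy exists; remote_mtime is None
    when the server reports no modification time.
    """
    if local_mtime is None:
        return True  # nothing to replace, so every policy downloads
    if policy is Replacement.ALWAYS:
        return True
    if policy is Replacement.NEVER:
        return False
    # WITH_NEWER: replace only when the remote copy is known to be newer.
    return remote_mtime is not None and remote_mtime > local_mtime


print(should_download(Replacement.WITH_NEWER, local_mtime=100, remote_mtime=200))
print(should_download(Replacement.NEVER, local_mtime=100, remote_mtime=200))
```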
File Modification
Use this control to specify how SiteSucker should modify files after they are downloaded. You can choose from the following options:
- None - SiteSucker does not modify the content of downloaded files, but it may change the file name by adding an appropriate extension.
- Localize - SiteSucker will localize downloaded files so that you will get the best results when browsing a site offline. This feature modifies the downloaded HTML and CSS documents by replacing every absolute link to a file on a Web server with the corresponding relative link to the local file. If there isn't a local file associated with a link, this setting will ensure that the link points to the file on the Web server.
- Delete After Analysis - SiteSucker deletes HTML and CSS files after they are downloaded and analyzed.
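The link rewriting described for the Localize option can be illustrated with a short Python sketch. The function localize_link and the downloaded mapping (absolute URL to local path) are hypothetical; the sketch shows only the idea of swapping an absolute link for a relative one when a local file exists, and is not a description of SiteSucker's internals.

```python
import posixpath


def localize_link(link, page_local_path, downloaded):
    """Hypothetical helper: rewrite one link found in a downloaded page.

    downloaded maps absolute URLs to local paths inside the destination folder.
    """
    target = downloaded.get(link)
    if target is None:
        # No local file for this link: keep pointing at the Web server.
        return link
    # Express the target relative to the folder that holds the page.
    return posixpath.relpath(target, start=posixpath.dirname(page_local_path))


downloaded = {
    "http://www.example.com/images/logo.png": "www.example.com/images/logo.png",
}
page = "www.example.com/directory/index.html"
print(localize_link("http://www.example.com/images/logo.png", page, downloaded))
print(localize_link("http://www.example.com/missing.css", page, downloaded))
```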
Path Constraint
Use this control to limit downloaded files to those at a specific site, those within a specific directory, or those containing a specific path. This option works in conjunction with the Path settings and the Include Supporting Files setting under the Webpage settings. SiteSucker provides the following path constraints:
- None - SiteSucker downloads the file specified in the URL text box and every file that it links to and every site that these files link to, etc. Be aware that this option could result in a HUGE download if allowed to continue forever.
- Host - SiteSucker limits the download to those files on the host of the original file being downloaded. For example, if the URL is http://www.example.com/directory/home.html, this setting limits the download to those URLs beginning with
- Host + 1 - SiteSucker limits the download to those files on the host of the original file being downloaded (just like the Host option), plus one level of files from other domains linked to the original host.
- Subdomains - SiteSucker limits the download to those files within the second-level domain and all subdomains of the original file being downloaded. Extending the previous example, this setting will download URLs beginning with
- Directory - SiteSucker only downloads those files that are within the directory of the original file being downloaded. Extending the previous example, this setting limits the download to those URLs beginning with
- Path Settings - SiteSucker only downloads the file specified in the URL text box and any files that have paths allowed by the Path settings.
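A few of the constraints above can be approximated as URL predicates, as in the Python sketch below. The function allowed is hypothetical, and the matching rules are one reading of the descriptions, not SiteSucker's actual logic: the second-level domain is taken naively as the last two host labels, Directory is matched by path prefix, and the Host + 1 and Path Settings options are not modeled.

```python
import posixpath
from urllib.parse import urlparse


def allowed(candidate, start_url, constraint):
    """Hypothetical helper: test a candidate URL against a path constraint."""
    start, cand = urlparse(start_url), urlparse(candidate)
    start_host = (start.hostname or "").lower()
    cand_host = (cand.hostname or "").lower()

    if constraint == "None":
        return True
    if constraint == "Host":
        return cand_host == start_host
    if constraint == "Subdomains":
        # Naive assumption: the second-level domain is the last two labels.
        domain = ".".join(start_host.split(".")[-2:])
        return cand_host == domain or cand_host.endswith("." + domain)
    if constraint == "Directory":
        directory = posixpath.dirname(start.path)
        if not directory.endswith("/"):
            directory += "/"
        return cand_host == start_host and cand.path.startswith(directory)
    raise ValueError(f"constraint not modeled in this sketch: {constraint}")


start = "http://www.example.com/directory/home.html"
print(allowed("http://images.example.com/a.png", start, "Host"))
print(allowed("http://images.example.com/a.png", start, "Subdomains"))
print(allowed("http://www.example.com/directory/sub/b.html", start, "Directory"))
```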
Use this control to select the local destination folder where files will be saved. By default, SiteSucker saves files to the us.sitesucker.mac.sitesucker folder in the Downloads folder in the user's home directory. To change the destination folder, choose the Set Destination Folder menu item.