Paths

The Paths tab in the Download Settings dialog lets you specify which paths should be included in or excluded from the download. In each text box, enter absolute URLs (that is, URLs beginning with "http://" or "https://") or regular expressions, one per line.

The paths settings work in conjunction with the Download Option setting under the Options tab according to the following rules:

  1. If the URL is the original URL (that is, the URL specified in the Web URL text box), then SiteSucker always downloads the file.
  2. Otherwise, if the URL begins with one of the strings (or matches one of the regular expressions) in the Paths to Exclude text box, then SiteSucker does not download the file.
  3. Otherwise, if the URL meets the requirements of the current Download Option setting, then SiteSucker downloads the file.
  4. Otherwise, if the URL begins with one of the strings (or matches one of the regular expressions) in the Paths to Include text box, then SiteSucker downloads the file.
  5. Otherwise, SiteSucker does not download the file.
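
As an illustration only, the rules above can be sketched in Python. This is not SiteSucker's actual code. The function names (should_download, matches_any) and the meets_download_option callback are hypothetical, and the sketch assumes that a regular expression must match the whole URL. It also uses Python's re module rather than ICU, so it only approximates how SiteSucker evaluates expressions.

```python
import re
from typing import Callable, Sequence


def matches_any(url: str, paths: Sequence[str], use_regex: bool) -> bool:
    """Return True if the URL begins with (or, in regex mode, matches) any path."""
    if use_regex:
        # Assumption: an expression must match the entire URL.
        return any(re.fullmatch(p, url) is not None for p in paths)
    return any(url.startswith(p) for p in paths)


def should_download(
    url: str,
    original_url: str,
    exclude: Sequence[str],
    include: Sequence[str],
    meets_download_option: Callable[[str], bool],
    use_regex: bool = False,
) -> bool:
    """Apply the five rules in order; the first rule that applies decides."""
    # Rule 1: the original URL is always downloaded.
    if url == original_url:
        return True
    # Rule 2: Paths to Exclude is checked before the Download Option.
    if matches_any(url, exclude, use_regex):
        return False
    # Rule 3: the current Download Option setting (hypothetical callback).
    if meets_download_option(url):
        return True
    # Rule 4: Paths to Include can admit URLs the Download Option rejects.
    if matches_any(url, include, use_regex):
        return True
    # Rule 5: everything else is skipped.
    return False
```

Because the rules are checked in order, an excluded path is skipped even when the Download Option would otherwise allow it, and an included path is downloaded even when the Download Option would otherwise reject it. The original URL is downloaded even if it appears in the Paths to Exclude text box.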

SiteSucker allows you to use regular expressions in the path strings. If the Use Regular Expressions box is checked, all paths in both text boxes are interpreted as regular expressions. For example, to match any URL that contains an underscore, enter the following regular expression: ".*_.*". Expressions are interpreted according to ICU v3 (for details, see the ICU User Guide for Regular Expressions). Consult the Regular Expressions Reference for additional guidance on regular expressions.
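
To try out an expression before entering it, a short script can help. The following sketch checks the underscore example with Python's re module. Python's syntax is not identical to ICU's, although a simple pattern like this one behaves the same way in both. The function name contains_underscore is hypothetical, and the sketch assumes the expression must match the whole URL.

```python
import re

# The underscore example from the text: any characters, an underscore, any characters.
UNDERSCORE = re.compile(r".*_.*")


def contains_underscore(url: str) -> bool:
    """Return True if the whole URL matches the expression."""
    # Assumption: the expression is matched against the entire URL.
    return UNDERSCORE.fullmatch(url) is not None
```

With this function, a URL such as "https://example.com/my_page.html" matches, while "https://example.com/page.html" does not.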