Paths

The Paths screen lets you specify which paths should be included in or excluded from the download.

Tapping "Paths to Include" or "Paths to Exclude" displays a text box where you can enter the paths. Enter absolute URLs (that is, URLs beginning with "http://" or "https://") or regular expressions, one per line.

The paths settings work in conjunction with the Path Constraint setting in the General screen according to the following rules:

  1. If the URL is the original URL (that is, the URL specified in the Web URL text box), then SiteSucker downloads the file.
  2. Otherwise, if the URL begins with one of the strings (or matches one of the regular expressions) in the Paths to Exclude text box, then SiteSucker does not download the file.
  3. Otherwise, if the URL meets the requirements of the current Path Constraint setting, then SiteSucker downloads the file.
  4. Otherwise, if the URL begins with one of the strings (or matches one of the regular expressions) in the Paths to Include text box, then SiteSucker downloads the file.
  5. Otherwise, if the Include Supporting Files setting is on and the URL references a non-HTML file type, then SiteSucker downloads the file.
  6. Otherwise, SiteSucker does not download the file.
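
The order of the rules above matters, so it may help to see them as a single decision function. The following Python sketch is only an illustration, not SiteSucker's actual code: the names should_download, matches_any, and looks_like_html are hypothetical, the Path Constraint check is passed in as a caller-supplied function, and the sketch assumes that a regular expression must match the whole URL and that a file counts as HTML when its extension is ".html", ".htm", or missing.

```python
import posixpath
import re
from typing import Callable, Sequence
from urllib.parse import urlparse


def matches_any(url: str, paths: Sequence[str], use_regex: bool) -> bool:
    """True if the URL begins with one of the strings or matches one of the regexes."""
    if use_regex:
        # Assumption: a pattern must match the entire URL.
        return any(re.fullmatch(p, url) is not None for p in paths)
    return any(url.startswith(p) for p in paths)


def looks_like_html(url: str) -> bool:
    """Rough guess at the file type from the URL's extension (an assumption)."""
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    return ext in ("", ".html", ".htm")


def should_download(
    url: str,
    original_url: str,
    exclude: Sequence[str],
    include: Sequence[str],
    meets_path_constraint: Callable[[str], bool],
    include_supporting_files: bool,
    use_regex: bool = False,
) -> bool:
    if url == original_url:                      # rule 1
        return True
    if matches_any(url, exclude, use_regex):     # rule 2
        return False
    if meets_path_constraint(url):               # rule 3
        return True
    if matches_any(url, include, use_regex):     # rule 4
        return True
    if include_supporting_files and not looks_like_html(url):  # rule 5
        return True
    return False                                 # rule 6
```

Because rule 2 is checked before rules 3 and 4, an excluded path wins over both the Path Constraint setting and the Paths to Include list. Only the original URL is downloaded regardless of the exclusions.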

SiteSucker allows you to use regular expressions in the path strings. If the Use Regular Expressions switch is on, all paths are interpreted as regular expressions. For example, to match any URL that contains an underscore, enter the following regular expression: ".*_.*". Expressions are interpreted according to ICU v3 (for details, see the ICU User Guide for Regular Expressions). Consult the Regular Expressions Reference for additional guidance on regular expressions.
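
To try out a pattern before entering it, you can test it against a few sample URLs. The short Python sketch below uses Python's re module as a stand-in. It is not the ICU engine that SiteSucker uses, so more advanced syntax may behave differently. The function name url_matches and the sample URLs are hypothetical, and the sketch assumes the pattern must match the whole URL.

```python
import re


def url_matches(pattern: str, url: str) -> bool:
    """True if the pattern matches the entire URL (an assumption of this sketch)."""
    return re.fullmatch(pattern, url) is not None


# The underscore example from the text.
pattern = ".*_.*"

for url in ("https://example.com/my_page.html", "https://example.com/page.html"):
    print(url, url_matches(pattern, url))
```

The first sample URL contains an underscore and matches; the second does not.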