Path

The Path screen lets you specify which paths should be included in or excluded from the download. It also provides a way to programmatically alter the names or paths of downloaded files.


Paths to Include or Exclude

The Paths to Include and Paths to Exclude settings work in conjunction with the Path Constraint setting under the General settings and the Include Supporting Files setting under the Webpage settings according to the following rules:

  1. If this is the original URL (that is, the URL specified in the URL text box), then the file is downloaded.
  2. Otherwise, if the URL begins with one of the strings (or matches one of the regular expressions) in the Paths to Exclude list, then the file is not downloaded.
  3. Otherwise, if the URL meets the requirements of the current Path Constraint setting, then the file is downloaded.
  4. Otherwise, if the URL begins with one of the strings (or matches one of the regular expressions) in the Paths to Include list, then the file is downloaded.
  5. Otherwise, if the Include Supporting Files setting is on and the URL references a non-HTML file type, then the file is downloaded.
  6. Otherwise, the file is not downloaded.
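The rules above can be read as a simple decision chain. The following Python sketch is only an illustration of that ordering, not SiteSucker's actual implementation. The function names, the (text, is_regex) list representation, and the boolean parameters standing in for the Path Constraint and file-type checks are all assumptions made for this example, and Python's re module stands in for the ICU regular expression engine.

```python
import re

def matches_any(url, entries):
    """Return True if url begins with a plain-string entry or is fully
    matched by a regular-expression entry.

    entries is a hypothetical list of (text, is_regex) pairs.
    """
    for text, is_regex in entries:
        if is_regex:
            if re.fullmatch(text, url):
                return True
        elif url.startswith(text):
            return True
    return False

def should_download(url, original_url, exclude, include,
                    meets_path_constraint, include_supporting_files, is_html):
    # Rule 1: the original URL is always downloaded.
    if url == original_url:
        return True
    # Rule 2: an excluded path is not downloaded.
    if matches_any(url, exclude):
        return False
    # Rule 3: a URL that satisfies the Path Constraint is downloaded.
    if meets_path_constraint:
        return True
    # Rule 4: an explicitly included path is downloaded.
    if matches_any(url, include):
        return True
    # Rule 5: supporting (non-HTML) files are downloaded if the setting is on.
    if include_supporting_files and not is_html:
        return True
    # Rule 6: everything else is skipped.
    return False
```

Note that the exclusion check comes before the Path Constraint and inclusion checks, so a URL that appears in both lists is not downloaded unless it is the original URL.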

In these lists, enter absolute URLs (that is, URLs beginning with http:// or https://) or regular expression patterns. URLs should be entered as they appear in the Safari address and search field, i.e., without encoding except for characters from the ISO-8859-1 extended character set and spaces (which are encoded as %20).

When using regular expressions, the pattern must match the entire URL, not just part of it. For example, to match any URL that contains an underscore, enter the regular expression .*_.* rather than a bare underscore. The pattern syntax currently supported is the one specified by ICU, which is described in the Regular Expressions section of the ICU User Guide.
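To illustrate the difference between matching the entire URL and matching part of it, here is a small Python sketch. It uses Python's re.fullmatch as a stand-in for ICU matching (the two syntaxes agree for this simple pattern, but they are not identical in general), and the helper name and example URLs are hypothetical.

```python
import re

def matches_entire_url(pattern, url):
    """Return True only if the pattern matches the whole URL."""
    return re.fullmatch(pattern, url) is not None

url = "https://example.com/my_page.html"

# A bare underscore matches only one character, so it fails as a whole-URL match.
bare = matches_entire_url(r"_", url)

# Wrapping it in .* on both sides lets the pattern cover the whole URL.
wrapped = matches_entire_url(r".*_.*", url)
```

Here bare is False and wrapped is True, which is why the pattern needs the leading and trailing .* to match any URL containing an underscore.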

If you tap the Edit button in the Paths to Include or Paths to Exclude screen, SiteSucker displays a toolbar with the following buttons:

Delete

Deletes the selected paths.

Edit

Allows you to edit the selected path.

Add

Allows you to add a new path. Turn on the Regular Expression button in the editor when adding a regular expression.


Paths to Replace

The Paths to Replace setting allows you to use regular expressions to replace the normal path or name of a downloaded file with a different path or name. See File Names for more information about how SiteSucker names downloaded files.

PathsToReplace

Enter a search pattern and a substitution template for each path you would like to replace. If the search pattern matches a file's path, the path will be altered in accordance with the substitution template. The search pattern must match the entire path relative to the destination folder. The template specifies what should be used to replace each match, with the backreference $0 representing the entire path, $1 representing the contents of the first capture group, and so on. Search patterns are applied in the order in which they appear in the list, and the order of search patterns can be rearranged by dragging them in the list when editing. A path that matches multiple search patterns may be modified more than once.

For example, in the image shown above, SiteSucker is instructed to do the following:

  1. move a site's graphics folder to the root level of the destination folder and then
  2. strip the html extension from any file that already has a php extension.
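The way search patterns and substitution templates are applied in order can be sketched in Python. This is a minimal illustration under stated assumptions, not SiteSucker's implementation: the two rules below are hypothetical patterns written to resemble the two steps just described (the actual patterns in the image may differ), the function names are invented for this example, and Python's re module is used in place of ICU, so $N backreferences are first converted to Python's \g<N> form.

```python
import re

def to_python_template(template):
    """Convert ICU-style $N backreferences to Python's \\g<N> form."""
    return re.sub(r"\$(\d+)", lambda m: "\\g<" + m.group(1) + ">", template)

def apply_replacements(path, rules):
    """Apply each (search pattern, template) rule in list order.

    A pattern must match the entire path relative to the destination
    folder. A path changed by one rule is passed on to the next rule,
    so it may be modified more than once.
    """
    for pattern, template in rules:
        match = re.fullmatch(pattern, path)
        if match:
            path = match.expand(to_python_template(template))
    return path

# Hypothetical rules, in the order they would appear in the list.
RULES = [
    # 1. Move a site's graphics folder to the root of the destination folder.
    (r"[^/]+/(graphics/.*)", "$1"),
    # 2. Strip the html extension from files that already have a php extension.
    (r"(.*\.php)\.html", "$1"),
]

moved = apply_replacements("example.com/graphics/logo.png", RULES)
stripped = apply_replacements("example.com/index.php.html", RULES)
both = apply_replacements("example.com/graphics/view.php.html", RULES)
```

With these assumed rules, moved is "graphics/logo.png", stripped is "example.com/index.php", and both is "graphics/view.php", because the last path matches the first rule and its result then matches the second.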

If you tap the Edit button in the Paths to Replace screen, SiteSucker displays a toolbar with the following buttons:

Delete

Deletes the selected paths.

Edit

Allows you to edit the selected path.

Add

Allows you to add a new path.