File Names

In most cases, SiteSucker uses the last path component of the URL being downloaded for the file name, and the other path components for the enclosing folder names. For example, when downloading http://www.example.com/directory/home.html, SiteSucker will save the file at www.example.com/directory/home.html in the destination folder. Specifically, SiteSucker will do the following:

  • Create a folder named "www.example.com" in the destination folder.
  • Create a folder named "directory" in the "www.example.com" folder.
  • Create a file named "home.html" in the "directory" folder.

If a URL ends with a '/', the file is given the name "index" with the appropriate file extension (usually html). So, when downloading the URL http://www.example.com/directory/, SiteSucker will save the file at www.example.com/directory/index.html in the destination folder.

If a URL does not end with a '/' or a file extension, SiteSucker considers it to be ambiguous. By default, SiteSucker will get the file name from the last path component of an ambiguous URL and will add the appropriate file extension (usually html). For example, if the URL is http://www.example.com/directory, SiteSucker will save the file at www.example.com/directory.html in the destination folder. However, if the Treat Ambiguous URLs as Folders option is on in the General settings, the same URL will be saved at www.example.com/directory/index.html in the destination folder.
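
The naming rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not SiteSucker's actual code: the function name local_path and its parameters are hypothetical, the extension is fixed to "html" rather than derived from the server response, and treating a URL with no path at all like one ending in '/' is an assumption.

```python
import posixpath
from urllib.parse import urlsplit


def local_path(url, ambiguous_as_folders=False, default_ext="html"):
    """Return the path, relative to the destination folder, for a URL.

    Hypothetical helper that mirrors the rules described above.
    """
    parts = urlsplit(url)
    host = parts.netloc
    path = parts.path or "/"  # assumption: no path behaves like a trailing '/'

    if path.endswith("/"):
        # Folder URL: the file is named "index" plus the extension.
        return host + path + "index." + default_ext

    last_component = posixpath.basename(path)
    if posixpath.splitext(last_component)[1]:
        # The last path component already has a file extension.
        return host + path

    # Ambiguous URL: no trailing '/' and no file extension.
    if ambiguous_as_folders:
        return host + path + "/index." + default_ext
    return host + path + "." + default_ext


print(local_path("http://www.example.com/directory/home.html"))
print(local_path("http://www.example.com/directory/"))
print(local_path("http://www.example.com/directory"))
print(local_path("http://www.example.com/directory", ambiguous_as_folders=True))
```

The four calls correspond to the examples in the text: a URL with a file name, a URL ending in '/', and an ambiguous URL with the Treat Ambiguous URLs as Folders option off and on.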

By default, if the server response includes an HTTP Content-Disposition header with a filename directive, SiteSucker will get the file name from that directive instead of from the URL. This behavior can be overridden by turning on the Ignore Filename in Headers option in the General settings.
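
As a rough illustration of how a file name can be read from a Content-Disposition header, the Python sketch below uses the standard library's email.message.Message class to parse the header value. The function name and the ignore_filename_in_headers parameter are hypothetical stand-ins for the option described above; how SiteSucker itself parses the header is not documented here.

```python
from email.message import Message


def filename_from_header(content_disposition, ignore_filename_in_headers=False):
    """Return the file name given in a Content-Disposition value, or None.

    Hypothetical helper: ignore_filename_in_headers stands in for the
    Ignore Filename in Headers option.
    """
    if ignore_filename_in_headers or not content_disposition:
        return None
    message = Message()
    message["Content-Disposition"] = content_disposition
    # get_filename() returns None when the header carries no file name.
    return message.get_filename()


print(filename_from_header('attachment; filename="report.pdf"'))
print(filename_from_header('attachment; filename="report.pdf"',
                           ignore_filename_in_headers=True))
```

When the function returns None, the name would instead come from the URL as described in the preceding paragraphs.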

Any characters in file or folder names that are not permitted by the Macintosh Operating System, or that can cause problems when loading a downloaded file in a web browser, will be replaced with a '_' character. Any file or folder name that is longer than 255 characters will be truncated to 255 characters.
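
The replacement and truncation steps might look like the following Python sketch. The set of characters in UNSAFE is purely an assumption chosen for illustration, since the exact characters SiteSucker replaces are not listed here; only the '_' replacement and the 255-character limit come from the text above.

```python
import re

# Assumption: an illustrative set of troublesome characters, not
# SiteSucker's actual list.
UNSAFE = re.compile(r'[:?#%*"<>|\\\x00-\x1f]')
MAX_NAME_LENGTH = 255


def sanitize_name(name):
    """Replace unsafe characters with '_' and truncate to 255 characters."""
    return UNSAFE.sub("_", name)[:MAX_NAME_LENGTH]


print(sanitize_name("page?id=7.html"))
print(len(sanitize_name("x" * 300)))
```

The function is applied to one file or folder name at a time, not to a whole path, so the '/' separators between folders are never affected.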

Finally, you can use regular expressions in the Replace table in the Path settings to replace the normal path or name of a downloaded file with a different path or name.
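
To give a feel for this kind of regular-expression replacement, the Python sketch below applies a small table of pattern and replacement pairs to a path. The table contents, the function name, and the choice to apply every rule in order are assumptions made for illustration; SiteSucker's own regular-expression syntax and rule ordering may differ from Python's re module.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs.
REPLACE_TABLE = [
    (r"^www\.example\.com/", "example/"),  # shorten the host folder name
    (r"\.htm$", ".html"),                  # normalize the file extension
]


def apply_replace_table(path, table):
    """Apply each (pattern, replacement) rule to the path in order."""
    for pattern, replacement in table:
        path = re.sub(pattern, replacement, path)
    return path


print(apply_replace_table("www.example.com/directory/home.htm", REPLACE_TABLE))
```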