File Type

The File Type pane of the Settings dialog provides the following controls:

Filter

Use the settings under this tab to specify which file types SiteSucker is allowed to download. (SiteSucker identifies file types by their media types, because the media type is standard information included in HTTP response headers.) The following options are available:

  • Allow All File Types - Download all files regardless of file type
  • Allow Specified File Types - Only download the specified file types
  • Disallow Specified File Types - Never download the specified file types

Note: SiteSucker always downloads HTML and CSS files regardless of this setting.

If the Allow Specified File Types or Disallow Specified File Types setting is selected, you can choose from the following options (shown with representative media types):

  • Archives - application/zip, application/tar, application/stuffit, …
  • Audio - audio/aiff, audio/mp3, audio/wav, audio/au, …
  • Images - image/jpeg, image/gif, image/tiff, image/png, …
  • Video - video/avi, video/mpg, video/x-ms-asf, … (SiteSucker Pro only)
  • Custom Types - as specified under the Custom Types tab
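The filter rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not SiteSucker's actual implementation: the function name, the mode strings ("all", "allow", "disallow"), and the assumption that HTML and CSS files arrive as text/html and text/css are all hypothetical.

```python
# Hypothetical sketch of the Filter setting; names and mode strings are assumptions.
# Assumes HTML and CSS files are served as text/html and text/css.
ALWAYS_DOWNLOADED = {"text/html", "text/css"}


def should_download(media_type, mode, specified):
    """Return True if a file with this media type passes the filter.

    mode      -- "all", "allow", or "disallow" (illustrative names)
    specified -- set of media types selected for the allow/disallow modes
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = media_type.split(";")[0].strip().lower()

    # HTML and CSS are downloaded regardless of the filter setting.
    if media_type in ALWAYS_DOWNLOADED:
        return True
    if mode == "all":
        return True
    if mode == "allow":
        return media_type in specified
    if mode == "disallow":
        return media_type not in specified
    raise ValueError(f"unknown filter mode: {mode!r}")
```

For instance, with mode "allow" and only image/png specified, an audio/mp3 file would be skipped, while a text/html page would still be downloaded.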

Custom Types and HTML Types

If the Custom Types option is selected under the Filter tab, then settings under the Custom Types tab let you specify which media types SiteSucker should allow or disallow.

Settings under the HTML Types tab let you specify which media types SiteSucker should treat as HTML. When SiteSucker downloads one of these files, it scans it for URLs.
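As a rough illustration of what treating a media type as HTML means, the sketch below scans a downloaded body for URLs only when its media type is in the HTML Types set. The class and function names are hypothetical, and collecting only href and src attributes is a simplifying assumption; SiteSucker's real scanner may look at more than that.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href and src attribute values (a simplified URL scan)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)


def scan_if_html(media_type, body, html_types):
    """Return URLs found in body, or [] if media_type is not treated as HTML."""
    if media_type not in html_types:
        return []
    collector = LinkCollector()
    collector.feed(body)
    return collector.urls
```

With html_types set to {"text/html"}, a text/plain file containing the same markup would not be scanned at all.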

Note: You can use the Log Media Types option under the Log settings to determine the media types of downloaded files.

To add a new media type to the Custom Types or HTML Types, click the Plus button and enter the new media type.

To remove media types, select the media types that you want to remove and click the Minus button.

To select a custom type (for the Filter setting) or a file type to treat as HTML, check the box next to the media type in the table.


Media Type Replacement

The Replace file type setting allows you to replace the media type assigned by the server to a downloaded file with a different media type. Some sites provide the wrong media type for certain files. This can cause SiteSucker to save files with the wrong file extension or to modify files that should not be modified. You can use this setting to correct the media type associated with a file and avoid these problems.

Enter a URL pattern and a new media type for each replacement you would like to make. Each URL pattern is a regular expression. If the pattern matches the URL from an HTTP response, the server-provided media type is replaced with the media type specified in the setting. For a match to occur, the regular expression must match the entire URL. Patterns are evaluated in the order in which they appear in the table; to change the order, drag rows within the table. Only the first matching pattern is applied, even if the URL matches multiple patterns.

For example, in the image shown above, SiteSucker is instructed to do the following:

  1. associate the application/epub+zip media type with any URL that has the epub file extension and
  2. associate the application/x-mobipocket-ebook media type with any URL that has the mobi file extension
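The matching rules described above (whole-URL match, table order, first match wins) can be sketched as follows. This is a minimal illustration, not SiteSucker's code: the function name and the two patterns are assumptions chosen to mirror the epub and mobi example, and Python's re module stands in for whatever regular expression engine SiteSucker uses.

```python
import re

# Hypothetical rule table: (URL pattern, replacement media type), in table order.
# The patterns are illustrative guesses, not the ones from SiteSucker's documentation.
RULES = [
    (r".*\.epub", "application/epub+zip"),
    (r".*\.mobi", "application/x-mobipocket-ebook"),
]


def replace_media_type(url, server_type, rules):
    """Return the media type to use for url, applying the first matching rule."""
    for pattern, new_type in rules:
        # fullmatch requires the pattern to match the entire URL.
        if re.fullmatch(pattern, url):
            return new_type  # first match wins; later rules are ignored
    # No rule matched, so keep the media type supplied by the server.
    return server_type
```

Because the whole URL must match, a pattern such as `.*\.epub` would not match a URL that ends in `.epub?download=1`; a pattern meant to cover query strings would have to allow for them explicitly.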

To add a row to the table, click the Plus button, enter the URL pattern and media type, and press ↩.

To remove rows from the table, select them in the table and click the Minus button.

To modify a row, double-click on a string in the table, enter a new string, and press ↩.