File Type
The File Type settings allow you to specify which file types SiteSucker should download and which it should treat as HTML. These settings also allow you to fix incorrect file type information sent from a site.
The File Type screen provides the following settings:
Filter
Use this setting to specify which file types SiteSucker is allowed to download. (SiteSucker identifies file types by their media types, since servers include this standard information in their HTTP response headers.) The following options are available:
- Allow All File Types - Download all files regardless of file type
- Allow Specified File Types - Only download the specified file types
- Disallow Specified File Types - Download all file types except the specified file types
Note: SiteSucker always downloads HTML and CSS files regardless of this setting. Furthermore, SiteSucker cannot download videos. See Frequently Asked Questions for more information.
If the Allow Specified File Types or Disallow Specified File Types setting is selected, you can choose from the following options (shown with representative media types):
- Archives - application/zip, application/tar, application/stuffit, etc.
- Audio Files - audio/aiff, audio/mp3, audio/wav, audio/au, etc.
- Images - image/jpeg, image/gif, image/tiff, image/png, etc.
- Custom Types - as specified in the Custom Types screen
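The filter rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not SiteSucker's actual implementation: the names FilterMode, ALWAYS_DOWNLOADED, and should_download are hypothetical, and the restriction on videos is not modeled.

```python
from enum import Enum


class FilterMode(Enum):
    """The three Filter options (hypothetical names for illustration)."""
    ALLOW_ALL = "Allow All File Types"
    ALLOW_SPECIFIED = "Allow Specified File Types"
    DISALLOW_SPECIFIED = "Disallow Specified File Types"


# HTML and CSS are always downloaded regardless of the filter setting.
ALWAYS_DOWNLOADED = {"text/html", "text/css"}


def should_download(media_type, mode, specified):
    """Return True if a file with this media type passes the filter."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = media_type.split(";")[0].strip().lower()
    if media_type in ALWAYS_DOWNLOADED:
        return True
    if mode is FilterMode.ALLOW_ALL:
        return True
    if mode is FilterMode.ALLOW_SPECIFIED:
        return media_type in specified
    # DISALLOW_SPECIFIED: everything except the listed types.
    return media_type not in specified
```

For example, with only the Images category selected under Allow Specified File Types, should_download("application/zip", FilterMode.ALLOW_SPECIFIED, {"image/jpeg", "image/png"}) returns False, while a text/html page still passes.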
Custom Types and HTML Types
If the Custom Types option is on, then settings in the Custom Types screen let you specify which media types SiteSucker should allow or disallow. To select a file type to allow or disallow, turn on the switch next to the media type in the list.
Use settings in the HTML Types screen to specify which media types SiteSucker should treat as HTML. When SiteSucker downloads one of these files, it scans it for URLs. To select a file type to treat as HTML, turn on the switch next to the media type in the list.
Warning: If the text/html option in the HTML Types screen is off, then webpages will not be scanned and nothing will be downloaded.
Note: You can use the Log Media Types option under the Log settings to determine the media types of downloaded files.
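To see why turning off text/html stops a download, consider this rough Python sketch of the scanning step. It is an assumption-laden illustration rather than SiteSucker's real scanner: HTML_TYPES and extract_urls are hypothetical names, application/xhtml+xml is an assumed entry in the list, and the simple regular expression only finds double-quoted href and src attributes.

```python
import re

# Media types treated as HTML (assumed contents, for illustration only).
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def extract_urls(media_type, body, html_types=HTML_TYPES):
    """Return the URLs found in a file, or nothing if it is not treated as HTML."""
    if media_type not in html_types:
        # Files that are not treated as HTML are never scanned,
        # so no further URLs are discovered from them.
        return []
    # Very rough link extraction: double-quoted href/src attribute values.
    return re.findall(r'(?:href|src)="([^"]+)"', body)
```

If html_types does not contain text/html, the starting webpage yields no URLs, so there is nothing further to download.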
If you tap the Edit button in the Custom Types or HTML Types screen, SiteSucker displays a toolbar with the following buttons:
- Delete - Deletes the selected media types.
- Edit - Allows you to edit the selected media type.
- Add - Allows you to add a new media type.
Media Type Replacement
The Media Type Replacement setting allows you to replace the media type assigned by the server to a downloaded file with a different media type. Some sites provide the wrong media type for certain files. This can cause SiteSucker to save files with the wrong file extension or to modify files that should not be modified. You can use this setting to correct the media type associated with a file and avoid these problems.
Enter a URL pattern and a new media type for each media type you would like to replace. The URL pattern is a regular expression: if it matches the URL from an HTTP response, the server-provided media type is replaced with the media type specified in the setting. For a match to occur, the regular expression must match the entire URL. Patterns are evaluated in the order in which they appear in the list; while editing, you can rearrange the replacements by dragging them in the list. Only the first matching pattern is applied, even if the URL matches multiple patterns.
For example, in the image shown above, SiteSucker is instructed to do the following:
- associate the application/epub+zip media type with any URL that has the epub file extension and
- associate the application/x-mobipocket-ebook media type with any URL that has the mobi file extension
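The matching rules (whole-URL match, list order, first match wins) can be sketched in Python as follows. The patterns below are assumptions chosen to match the epub and mobi example; the exact patterns in the actual setting are not reproduced here, and replace_media_type is a hypothetical name. Python's re module is also only a stand-in, since the regular expression syntax SiteSucker accepts may differ in details.

```python
import re

# Ordered list of (URL pattern, new media type) pairs.
# These patterns are assumed for illustration.
REPLACEMENTS = [
    (r".*\.epub", "application/epub+zip"),
    (r".*\.mobi", "application/x-mobipocket-ebook"),
]


def replace_media_type(url, server_type, replacements=REPLACEMENTS):
    """Return the media type to use for a URL after applying replacements."""
    for pattern, new_type in replacements:
        # fullmatch requires the pattern to match the entire URL.
        if re.fullmatch(pattern, url):
            # Only the first matching pattern is applied.
            return new_type
    # No pattern matched: keep the media type sent by the server.
    return server_type
```

Note that under these assumed patterns a URL such as https://example.com/a.epub?x=1 is not replaced, because the pattern must match the entire URL, including any query string.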
If you tap the Edit button in the Media Type Replacement screen, SiteSucker displays a toolbar with the following buttons:
- Delete - Deletes the selected media type replacements.
- Edit - Allows you to edit the selected media type replacement.
- Add - Allows you to add a new media type replacement.