SiteSucker allows you to specify an executable script that will run before or after each downloaded HTML file is analyzed. The script can modify the HTML file and can return URLs, which SiteSucker will download if the other settings permit. SiteSucker accepts a shell script, Perl script, or Python script, but not an AppleScript.
SiteSucker passes two arguments to the script. The first argument is the path to the HTML file, and the second argument is the URL that was used to download the HTML file. The script can return absolute or relative URLs to SiteSucker by writing them to standard output. The URLs written to standard output should be UTF-8 encoded and separated by newline characters. If a URL is enclosed in quotation marks (") in the HTML file, the URL written to standard output should also be enclosed in quotation marks. This reduces the likelihood of errors when the URL is localized in the HTML file. If a relative URL is returned, SiteSucker will not localize that URL in the HTML file.
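The argument and output conventions described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a script supplied with SiteSucker: the idea that the pages reference extra files through a double-quoted data-src attribute is hypothetical, and the names find_urls, DATA_SRC, and main are invented for this example.

```python
#!/usr/bin/env python3
"""Hypothetical SiteSucker user script: report URLs found in data-src attributes."""
import re
import sys

# Assumption: the pages of interest reference extra files through a
# double-quoted data-src attribute, e.g. <img data-src="images/a.png">.
# The capture group keeps the surrounding quotation marks.
DATA_SRC = re.compile(r'data-src=("[^"]+")')


def find_urls(html):
    """Return each data-src value, still enclosed in its quotation marks."""
    return DATA_SRC.findall(html)


def main(html_path, page_url):
    # page_url is the URL the file was downloaded from; this sketch does not
    # need it, but a script could use it to build absolute URLs.
    with open(html_path, encoding="utf-8", errors="replace") as f:
        html = f.read()
    urls = find_urls(html)
    if urls:
        # One URL per line, written to standard output as UTF-8 bytes.
        sys.stdout.buffer.write(("\n".join(urls) + "\n").encode("utf-8"))


# Run only when both arguments (file path and URL) were supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Because the URLs in this example are quoted in the HTML file, the script writes them with their quotation marks, as recommended above. A script that needs to change the page itself could rewrite the file at html_path before returning.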
After your script is created, place it in the user scripts folder for SiteSucker:
~/Library/Application Scripts/us.sitesucker.mac.sitesucker
You can quickly open this folder by clicking the Open Scripts Folder button in the Script pane of the Settings dialog. The script must be executable before it will appear in the pop-up controls. You can use the following commands in the Terminal app to make the script executable, replacing script with the file name of your script:
cd ~/Library/Application\ Scripts/us.sitesucker.mac.sitesucker
chmod +x script
The Script pane of the Settings dialog provides the following controls:
Use this control to specify an executable script that will run before each downloaded HTML file is analyzed.
Use this control to specify an executable script that will run after each downloaded HTML file is analyzed.
Open Scripts Folder
Use this control to open the folder where your scripts must be located.