wget Tips and Tricks

Some useful wget features:

Filter downloads by extensions using -A (Accept) and -R (Reject):
```
wget -r -A.pdf http://example.com/
wget -r -R.html http://example.com/
```
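Both commands recursively traverse the website http://example.com/. The first keeps only PDF files; the second keeps everything except HTML files. In both cases wget still fetches HTML pages temporarily to discover links, then deletes those that the filter excludes.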
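
The accept list can also hold several comma-separated suffixes, and recursion is usually worth limiting. A minimal sketch, assuming a hypothetical helper named `fetch_by_ext` (the name, argument order, and default depth of 2 are illustrative, not part of wget):

```shell
# Hypothetical helper: fetch_by_ext URL EXT[,EXT...] [DEPTH]
fetch_by_ext() {
    url=$1
    exts=$2
    depth=${3:-2}   # assumed default; wget's own default depth is 5

    # -r   recurse through links
    # -l   limit recursion depth
    # -np  never ascend to the parent directory
    # -nd  save files flat, without recreating the directory tree
    # -A   comma-separated list of suffixes to accept
    wget -r -l "$depth" -np -nd -A "$exts" "$url"
}
```

For example, `fetch_by_ext http://example.com/ pdf,epub` would collect PDF and EPUB files up to two levels deep into the current directory.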

Check for broken links on a site:
```
wget --spider -r -o log.txt http://example.com
```
With --spider, wget behaves like a "web spider": it follows links recursively and checks that each one exists without saving the pages, and -o writes the results to log.txt.
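
The exit status is a quick way to tell whether the run found problems before reading the log. A minimal sketch, assuming a hypothetical wrapper named `check_links`; the `grep` pattern is also an assumption, since the exact wording of log messages can vary with the wget version and locale:

```shell
# Hypothetical wrapper: check_links URL [LOGFILE]
check_links() {
    url=$1
    log=${2:-log.txt}

    wget --spider -r -o "$log" "$url"
    status=$?

    if [ "$status" -eq 0 ]; then
        echo "no problems reported for $url"
    else
        # wget documents exit status 8 as "server issued an error response",
        # which is what a 404 on a linked page produces.
        echo "wget exited with status $status; see $log"
        # Assumed pattern: show log lines that mention broken links, if any.
        grep -i 'broken link' "$log" || true
    fi
    return "$status"
}
```

Calling `check_links http://example.com` leaves the full details in log.txt and returns wget's own exit status, so it can be used in scripts or cron jobs.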

Download files that require cookies (e.g., Sun JDK) without a browser:
```
wget --header='Cookie: gpw_e24=<VALUE_OF_COOKIE>' '<DOWNLOAD_LINK>'
```
- <VALUE_OF_COOKIE>: the value of the gpw_e24 cookie, which confirms that the license agreement was accepted.
- <DOWNLOAD_LINK>: the URL of the file that emerge requires.

You can specify an output path and filename with -O, or simply change to the desired directory first, and wget will save the file under the name it has on the server.
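
Both ways of choosing the destination can be wrapped in one function. A minimal sketch, assuming a hypothetical helper named `fetch_with_cookie`; the cookie value and URL remain placeholders that you must supply yourself:

```shell
# Hypothetical helper: fetch_with_cookie COOKIE_VALUE URL [OUTPUT_FILE]
fetch_with_cookie() {
    cookie=$1
    url=$2
    dest=${3:-}

    if [ -n "$dest" ]; then
        # -O writes the download to the given path and filename
        wget --header="Cookie: gpw_e24=$cookie" -O "$dest" "$url"
    else
        # Without -O, the file lands in the current directory under the
        # name taken from the URL
        wget --header="Cookie: gpw_e24=$cookie" "$url"
    fi
}
```

Usage would look like `fetch_with_cookie '<VALUE_OF_COOKIE>' '<DOWNLOAD_LINK>'`, optionally followed by a destination path as the third argument.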