Using Wget to Download a Website
Wget is a great tool for downloading individual webpages or entire websites. View the Wget manual.
When the site is password protected, the following steps can be used to download it:
- Download wget.exe for Windows (no installation is needed)
- Get the Cookies.txt Chrome extension
- Log in to the site
- Use the Cookies.txt extension to copy-paste the cookies into a file named cookies.txt in the same folder as wget.exe
- Run a command that looks like this (maybe it’s overkill, but after banging on it for a while, this worked. So there):
wget --recursive --no-parent --timestamping=on --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --load-cookies cookies.txt -e robots=off -U mozilla "https://www.EXAMPLE.COM/WHATEVER/"
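Before running the command, it can help to confirm that cookies.txt is in the shape Wget expects. The sketch below assumes the extension exports the Netscape cookie file format (seven tab-separated fields per cookie, with comment lines starting with #), which is the format --load-cookies reads. It also assumes a Unix-like shell with awk (for example Git Bash or WSL on Windows). check_cookie_file is a hypothetical helper name, not part of Wget.

```shell
#!/bin/sh
# Sketch: sanity-check a cookies.txt export before passing it to
# wget --load-cookies. check_cookie_file is a hypothetical helper.

check_cookie_file() {
    file="$1"

    # -s is true only if the file exists and is not empty.
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file"
        return 1
    fi

    # Count usable lines (exactly 7 tab-separated fields) and malformed
    # lines, skipping comments and blank lines.
    counts=$(awk -F '\t' '
        /^#/ || /^[ \t\r]*$/ { next }
        NF == 7 { good++; next }
        { bad++ }
        END { print good + 0, bad + 0 }
    ' "$file")
    good=${counts% *}
    bad=${counts#* }

    if [ "$bad" -gt 0 ] || [ "$good" -eq 0 ]; then
        echo "problem: $good usable line(s), $bad malformed line(s)"
        return 1
    fi
    echo "ok: $good cookie line(s)"
}
```

Running check_cookie_file cookies.txt in the folder that holds wget.exe should print an "ok" line before the fetch is started. Separately, if Wget stops with a message that it cannot timestamp and not clobber old files at the same time, dropping --no-clobber from the command above is a reasonable adjustment, since some Wget versions refuse that combination.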
I have had a little trouble with the cookies not working correctly. I’m not sure yet whether the cookie expires after a few minutes without being renewed, or whether Wget is somehow fetching too aggressively. I did successfully fetch a site by refreshing the homepage in Chrome (which may have kept the cookie valid) and starting the fetch from below the top level.
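One way to narrow down the expiry theory is to look at the expiry timestamps in cookies.txt. This is only a sketch, again assuming the Netscape cookie file format, where the fifth field is the expiry as Unix epoch seconds and the sixth is the cookie name, and where exporters conventionally write 0 for session cookies. list_expired_cookies is a hypothetical helper name; it relies on date +%s, which is widely available but not guaranteed on every system.

```shell
#!/bin/sh
# Sketch: print the names of cookies in a Netscape-format cookies.txt
# whose expiry time is already in the past.
# Usage: list_expired_cookies FILE [NOW_EPOCH_SECONDS]

list_expired_cookies() {
    file="$1"
    # Default to the current time; a second argument overrides it.
    now="${2:-$(date +%s)}"

    awk -F '\t' -v now="$now" '
        # Skip comments and anything that is not a 7-field cookie line.
        /^#/ || NF != 7 { next }
        # Field 5 is the expiry as Unix epoch seconds; 0 marks a session cookie.
        $5 + 0 > 0 && $5 + 0 < now + 0 { print $6 }
    ' "$file"
}
```

If list_expired_cookies cookies.txt prints the login cookie shortly after the export, expiry is a likely suspect and a fresh export is worth trying. If it prints nothing, the fetch rate may be the issue instead; Wget's --wait=SECONDS and --random-wait options slow the crawl down and might help, though I have not confirmed that they fix this particular problem.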