Ever had that terrifying feeling you’ve lost your blog? Perhaps your WordPress installation got hacked, or your web hosts royally screwed up with a “database upgrade”. Either way, there’s an almost infinite array of reasons to download and back up a copy of your website, and precisely zero reasons to neglect doing it.
If you’re a Linux user, there are lots of guides out there on how to use WGET, the free network utility for retrieving files from the web over HTTP and FTP, but hardly any for doing the same on Windows. Unless you fancy installing Ubuntu or Crunchbang, here’s a handy guide to downloading your site using WGET in Windows.
1) Download WGET
Download wget.exe and save it to your desktop. I recommend the WGET for Windows (win32) build from Ugent.be, as it’s the most up-to-date version I could find. For info, you can also get WGET from Brothersoft, but avoid the “WGET for Windows” download page, because its installer doesn’t work with Windows Vista.
2) Make WGET a command you can run from anywhere in the Command Prompt
If you want to be able to run WGET from any directory inside the command terminal, you’ll need to learn about the path command to work out where to copy your new executable.
First, open a command terminal. If you’re using Windows XP, select “Run” in the Start menu and type “cmd”. If you’re running Windows Vista, go to “All Programs > Accessories > Command Prompt” from the Start menu. A Command Prompt window will open with a prompt waiting for you to type commands.
We’re going to move wget.exe into a Windows directory that will allow WGET to be run from anywhere. First, we need to find out which directory that should be. Type path at the command prompt and press Enter; it prints the list of directories that Windows searches for executables, separated by semicolons.
Thanks to the “Path” environment variable, we know that we need to copy wget.exe to either the C:\Windows\System32\ directory or the C:\Windows\ directory. Go ahead and copy WGET to either of the directories you see in your Command Terminal.
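If the Path output is a long, hard-to-read line, a small script can print one directory per line. This is only a rough sketch and assumes you have Python installed, which this guide doesn't otherwise need; the helper name split_path is made up for illustration.

```python
import os


def split_path(path_value, separator=os.pathsep):
    """Split a PATH-style string into its directories, skipping empty entries."""
    return [entry for entry in path_value.split(separator) if entry]


if __name__ == "__main__":
    # Print every directory Windows searches for executables such as wget.exe
    for directory in split_path(os.environ.get("PATH", "")):
        print(directory)
```

Any directory in that list is a candidate home for wget.exe.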
3) Restart terminal and test WGET
To test that WGET is working properly, restart your terminal and run WGET with its help option, for example wget --help.
If you’ve copied the file to the right place, you’ll see a help screen listing all of the available options.
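If nothing appears, it can help to check whether Windows can find wget.exe at all. Here's a minimal sketch of that check, again assuming Python is installed; find_executable is a hypothetical helper name, and it relies on the standard library's shutil.which, which searches the same Path directories the Command Prompt does.

```python
import shutil


def find_executable(name="wget"):
    """Return the full path to `name` if it is on the Path, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable()
    if location:
        print("WGET found at", location)
    else:
        print("WGET is not on the Path yet; check where you copied wget.exe")
```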
4) Make a directory to download your site to
Seeing that we’ll be working in Command Prompt, let’s create a download directory just for WGET downloads. (If you’re familiar with Command Prompt basics, just skip this step.) Change to C:\ with cd C:\ and use md (make directory) to create the directory, for example md site-download.
Change to your new directory (cd site-download) and you’re ready to do some downloading!
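If you end up doing this regularly, the directory step could be scripted too. A rough sketch, assuming Python is installed; make_download_dir is an illustrative name, and site-download is simply the directory name used in this guide.

```python
import os


def make_download_dir(parent, name="site-download"):
    """Create parent/name if it does not exist yet and return its full path."""
    target = os.path.join(parent, name)
    os.makedirs(target, exist_ok=True)  # no error if the directory is already there
    return target


# Example (on Windows): make_download_dir("C:\\") creates C:\site-download
```

Because of exist_ok=True, running it a second time leaves an existing directory and its contents alone.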
5) Download your site using WGET
Ok, the fun bit begins. Once you’ve got WGET installed and you’ve created a new directory, all you have to do is learn some of the finer points of WGET arguments to make sure you get what you need.
I found two particularly useful resources for WGET usage: the Gnu.org WGET Manual and About.com’s Linux WGET guide are definitely the best.
To mirror your site (-r downloads recursively, following links up to five levels deep by default):
wget -r http://www.yoursite.com
To mirror the site and convert all of the URLs so the links work in your local copy:
wget --convert-links -r http://www.yoursite.com
To mirror the site and save the pages with an .html extension (newer versions of WGET also call this option --adjust-extension):
wget --html-extension -r http://www.yoursite.com
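The options above can be combined in a single command. Here's a minimal sketch that builds that command from the flags already shown and optionally runs it, assuming Python is installed and wget is on your Path. The function names build_wget_command and run_wget are made up for this example; they are not part of WGET.

```python
import subprocess


def build_wget_command(url, convert_links=True, html_extension=True):
    """Build the argument list for a recursive WGET download of `url`."""
    command = ["wget", "-r"]
    if convert_links:
        command.append("--convert-links")
    if html_extension:
        command.append("--html-extension")
    command.append(url)
    return command


def run_wget(url):
    """Run WGET in the current directory and return its exit code (0 means success)."""
    return subprocess.call(build_wget_command(url))


if __name__ == "__main__":
    # Print the command rather than running it, so you can check it first
    print(" ".join(build_wget_command("http://www.yoursite.com")))
```

Printing the command first is a deliberate choice: you can paste it into the Command Prompt yourself, or call run_wget once you're happy with it.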
6) Is WGETting you blocked?
See what I did there? Some web servers are set up to deny WGET’s default user agent, for obvious bandwidth-saving reasons. You could try changing your user agent to get round this. Try, er, pretending to be Googlebot:
wget --user-agent="Googlebot/2.1 (+http://www.googlebot.com/bot.html)" -r http://www.yoursite.com
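The quotes in that command matter, because the user agent string contains a space. If you script WGET instead, passing the arguments as a list sidesteps the quoting. A small sketch under the same assumptions as before (Python installed, helper name with_user_agent invented for illustration):

```python
GOOGLEBOT_AGENT = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def with_user_agent(command, user_agent=GOOGLEBOT_AGENT):
    """Return a copy of a WGET argument list with --user-agent added after 'wget'."""
    # No extra quotes needed: each list item reaches WGET as one whole argument
    return [command[0], "--user-agent=" + user_agent] + list(command[1:])


if __name__ == "__main__":
    for argument in with_user_agent(["wget", "-r", "http://www.yoursite.com"]):
        print(argument)
```

The resulting list can be handed to subprocess.call from Python's standard library to start the download.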
On that last note, lots of hosting companies block WGET. Mine included! It took me a while to be able to back up my own site, but now I feel pretty safe knowing that I have backups of the database, the plugins, the images and even the HTML of the site itself. Happy WGETting!