At the recent SEOmoz / Distilled Pro Seminar in London, I gave a presentation on advanced keyword research. Today I’d like to share one of the tips I gave on how to extract your competitors target keyword list, with relative ease using tools we all have at our disposal.
Here’s the pro training slide, just if you’re keen for a sneaky preview of this post:
Predictable, SEO people can be
That meticulous attention to detail SEM practitioners apply to the optimised pages of their big dynamic sites can be quite predictable sometimes. Even really big sites consist of only a few page templates, which makes extracting data from them simple. Let’s use this blog post to learn the basic skills needed to generate an instant keyword list, with no keyword research tools whatsoever. Yet.
It’s been a while since we mentioned Xenu’s Link Sleuth
How did you guess I was going to mention Xenu? Xenu’s a cracking piece of software, enjoyed by SEOs pretty much since the dark ages, and I totally recommend it. While I write this post, Xenu is dutifully crawling a jobs website ready for me to export the data to Excel. Note the “Title” column, an oft-overlooked column in Xenu’s captured data list:
Carry out a crawl of your target website, with the appropriate settings to avoid crawling external links. We’re really only interested in the onsite data, list of URLs and Titles.
Dust off your Excel Skills
Start by exporting the data you’ve gathered into Excel via a TAB separated text file. That’s quite easy if you’re unfamilar with the process, just go to file > export to tab separated file. Import the file into Excel, just like this:
PS – by a handy coincidence, Excel imports text files with the TAB delimiter selected by default. You can just click finish straight up when you’re importing one of these files.
Make a table and work out some formulas
There are some seriously cool functions in Excel for handling text. My favourites, in no particulalr order, are; “LEFT”, “MID”, “RIGHT”, “FIND” and “SEARCH”. An honourable mention goes out to “LEN”, too.
Here’s what they do:
LEFT – Returns the specified number of characters from the start of a text string
MID – Returns the characters from the middle of a text string, given the starting position and length
RIGHT – Returns the specified number of characters from the end of a text string
When you combine the queries above with the numerical output from the queries “FIND” (Returns the starting position of one text string from inside another), and “LEN” (returns the number of characters in a string), you can quickly construct mechanisms to extract and repurpose the keyword data extracted from your site crawl.
Here’s a simple example:
=IFERROR(LEFT(Table1[[#This Row],[Title]],FIND(” in”,Table1[[#This Row],[Title]],1)),”No Jobs”)
If you translated this Excel query into English, it might read:
Return the characters in column “Title” in this row, until the string “in” is found. If there’s an error with this formula, display the text “no jobs”.
Download the example
You can see this example in action in the template file I’ve created for you to download here. Check out the “Optimised KW 1″ column, which extracts generic job category terms from our test crawl data.
Here’s a more complex example, also included in the data
=IFERROR(MID(Table1[[#This Row],[Title]],(FIND(“in “,Table1[[#This Row],[Title]],1)+3),FIND(” | “,Table1[[#This Row],[Title]],1)-(FIND(“in “,Table1[[#This Row],[Title]],1)+3)),”No Location”)
This one extracts the location contained in the title data, which you can see in action in our “Location” column. The query looks a little more daunting, but if you de-construct it, bit by bit – you’ll see that the query is basically the same as the first, using the text “in ” as a starting point and returns characters until the ” | “. As FIND is used to return the position of the pipe, we subtract the starting position of the “in ” to get the number of characters between our two markers – thus extracting our precious location from the title. WIN.
This trick isn’t just for keyword research
If you can improve your skillset to include extracting data, like job titles, locations, stock / gift items, books, car data etc, you can repurpose it or make it “more unique” quite easily, too. Think about concatenating your new data points together in new sentences and title constructions, ready to upload to your own database (laughs evil laugh).
If you’re keen to get the data from other sources, such as search results pages, product listing pages and the like, you could also give importXML a try using Google Docs. Of course you’d probably be keen to get the data into Excel as quickly as possible either by importing it, or editing it within Excel using Google Cloud Connect. Have fun!