Analysing Inbound Links

The steps outlined below involve the use of Microsoft Access 2007 and SQL. If you don’t have this application, or you are scared of SQL (don’t be), turn back now! Or read on and then suggest an alternative approach and I’ll incorporate that into a future post. I initially tried to analyse inbound links using Microsoft Excel, but Excel’s custom filter only lets you specify two conditions at a time, and I needed more than that. If this means nothing to you now, just wait until you’ve read the whole post! It will mean even less.

I was looking at a site for sale today. It was two months old and had decent traffic for a site so young. I’m not much of a link builder so I thought I would investigate the site’s inbound links. If the site had a large number of inbound links, then I would consider buying the site and if it didn’t have many then I would forget it and move on.

Get All The Inbound Links Using Yahoo Site Explorer

My first port of call was Yahoo Site Explorer to determine all the links to the site that Yahoo knew about. Note that these aren’t necessarily the links that Google and the other search engines know about, but they’re close enough to give a good idea. I may as well give you the site I was looking at, for demonstration purposes. These are the inbound links, excluding internals. That’s a big list. Straight away I can see that there are links gained from blog commenting on sites like desmondblog and theuniversitykid. I discount these immediately as they are nofollowed and have no value in Google.

Further down the list are a vast selection of links from namepros. So the site owner is accumulating forum sig links too. I’m not interested in these either as they contribute very little SEO value. Also, we both know that as soon as the site is transferred to me, those sig links will be long gone. There is no point in the ex-owner promoting a site that is no longer his.

So, I’ve got a big list of links and I want to remove all the ones that I recognise as blog comments and forum posts. Whatever links remain may (only may) be of value.

Export The List Of Inlinks From Yahoo Site Explorer

Those wonderful people at Yahoo provide us with the marvellous facility to export the list of links in TSV (tab-separated values) format. We can export the list to a file, save it to our PC and then query the file to get just the links we’re interested in.

When you click the TSV link, you’ll be presented with the familiar ‘save file as’ dialogue box. Find a place on your hard drive to save it and ensure that the file extension is .txt and not .tsv.

Import Your Link Data Into Microsoft Access

Now that we have a file full of inbound links, we can import it into Microsoft Access. I created a database called InboundLinks and then imported the .txt file as a table within that database. I kept the name of the database generic so that I can import other site-specific data in the future. Cunning, I know.

To import the links, click External Data > Import > Text File (you can now see why we changed the file extension to .txt). Ensure that ‘Import the source data into a new table in the current database’ is selected and click the Browse button to locate and select your text file. Click OK. On the next dialogue box, ensure that the Delimited option is selected. Although we saved our file as a .txt, it was exported as a TSV (tab-separated) file. Click Next and then select Tab as the delimiter that separates the fields. Click Next.

We’ll keep things simple and not bother renaming our fields, so click Next again. Let Access add a primary key by ensuring that option is selected, and click Next. Either leave the Import To Table field at the default value or give it a meaningful name like ‘OMG I’m going to be rich once I’ve done this link analysis and bought that site’. Click Finish and then close the dialogue box.
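If you don’t have Access to hand, the same tab-delimited import can be roughed out in a few lines of Python. This is only a sketch under assumptions: I’m assuming the link URL sits in the second column of the export (which is what field2 turns out to be later on), the sample rows are made up, and `read_inlinks` is a name I’ve invented for illustration.

```python
import csv
import io


def read_inlinks(handle, url_column=1):
    """Return the URL from each tab-separated row, skipping rows that are too short."""
    reader = csv.reader(handle, delimiter="\t")
    return [row[url_column].strip() for row in reader if len(row) > url_column]


# Made-up sample standing in for the exported .txt file.
sample = io.StringIO(
    "A blog post\thttp://www.desmondblog.com/some-post\n"
    "A forum thread\thttp://www.namepros.com/thread-1\n"
    "\n"
    "An article\thttp://example.org/article\n"
)

links = read_inlinks(sample)
print(links)
```

For a real export you would pass an open file instead of the sample, along the lines of `open("inlinks.txt", newline="")`, where the filename is whatever you saved the export as. You may also need to set an encoding, depending on how the file was written.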

The data is in!

Analyse The Links With SQL

Double-click on the table name on the left and then click the Create tab. We’re going to create a query. Click Query Design and if the Show Table dialogue box appears, close it. We’re going to start loosening nuts and bolts with SQL. In the Results category on the Design tab, click SQL.

I know for a fact that some of the inlinks are from desmondblog. We don’t want these, so I’m going to type a SQL statement that selects all records except those whose link contains ‘desmondblog.com’. The asterisks either side of the domain are Access wildcards that match any run of characters.

SELECT * FROM Url_inlinks WHERE field2 NOT LIKE '*desmondblog.com*'

Url_inlinks is the name of my inlinks table and field2 is the inlinks field (or column, if you like Excel). Once that’s typed in, click Run to see the links our SQL statement selected. This shortens the list slightly but there are still links from namepros in there that I don’t want. In fact, here is the list of sources of links I’m going to exclude:

  • desmondblog.com
  • theuniversitykid.com
  • namepros.com
  • webdesigntalk.net

and here is the corresponding SQL statement that excludes those sources:

SELECT * FROM Url_inlinks WHERE field2 NOT LIKE '*desmondblog.com*'
AND field2 NOT LIKE '*theuniversitykid.com*'
AND field2 NOT LIKE '*namepros.com*'
AND field2 NOT LIKE '*webdesigntalk.net*'
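The same exclusion list can be applied outside Access too. Here is a minimal Python sketch, with one deliberate difference from the SQL above: it compares the host name of each link rather than searching the whole URL, so a link that merely mentions namepros.com in its query string is kept. The function names and the sample URLs are invented for illustration, and the sketch assumes every link starts with http:// or https://.

```python
from urllib.parse import urlparse

# The four sources excluded in the SQL statement above.
EXCLUDED = (
    "desmondblog.com",
    "theuniversitykid.com",
    "namepros.com",
    "webdesigntalk.net",
)


def is_excluded(url, excluded=EXCLUDED):
    """True when the URL's host is an excluded domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in excluded)


def keep_links(urls, excluded=EXCLUDED):
    """Return only the links whose host is not on the exclusion list."""
    return [url for url in urls if not is_excluded(url, excluded)]


# Made-up sample links.
sample = [
    "http://www.desmondblog.com/some-post",
    "http://forums.namepros.com/thread-1",
    "http://example.org/article",
    "http://example.org/?ref=namepros.com",
]

print(keep_links(sample))
```

Matching on the host is a bit stricter than the wildcard match. If you would rather mimic the SQL exactly, a plain substring test on the whole URL would do it.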

This shortens the list enough for me to assess the remaining inlinks but, to be honest, I wish I hadn’t embarked on this lengthy, rambling excursion because the links are pretty much worthless! There’s one from the Sitepoint auction page, and a few more assorted blog comments and forum posts. All inconsequential, and none add any real value to the site for sale.

Still, this will be my blueprint for future link analysis. I’m going to call it the Link Checking Blueprint.

I know that the Office 2007 suite is an extravagant expense simply for checking links, but I’m told that there is database and SQL functionality in Open Office. Is there an easier way to get a list of inlinks and then omit the useless ones from view? Do you do this and have a better method? Let me know and I’ll steal your idea and write an ebook around it.
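To get the ball rolling, here is one no-cost route, sketched with the sqlite3 module that ships with Python rather than with Open Office (which I haven’t tried). It runs the same kind of query against an in-memory database. The table and field names simply copy the ones Access gave me, and the rows are made up. The main thing to watch is the wildcard: SQLite uses % where Access, by default, uses *.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Url_inlinks (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT)"
)

# Made-up rows standing in for the imported link data.
rows = [
    ("A blog post", "http://www.desmondblog.com/some-post"),
    ("A forum thread", "http://www.namepros.com/thread-1"),
    ("An article", "http://example.org/article"),
]
conn.executemany("INSERT INTO Url_inlinks (field1, field2) VALUES (?, ?)", rows)

excluded = [
    "desmondblog.com",
    "theuniversitykid.com",
    "namepros.com",
    "webdesigntalk.net",
]

# Build one "field2 NOT LIKE ?" condition per excluded source.
where = " AND ".join("field2 NOT LIKE ?" for _ in excluded)
query = f"SELECT field2 FROM Url_inlinks WHERE {where} ORDER BY id"
params = [f"%{domain}%" for domain in excluded]

remaining = [row[0] for row in conn.execute(query, params)]
print(remaining)
conn.close()
```

Because the conditions are built from the list, adding another source to ignore is a one-line change rather than another line of SQL.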