Search engine commands and tools
Posted: Nov 9, 04, 18:54 techAdmin
Search Engine Tools
Search Engine Commands
You can get specific results in various search engines using the following commands. [Note: domain.com stands for the domain name you are searching for; substitute your own search terms for 'any text'.] In general you may get more accurate results for link:, linkdomain:, and allinurl: if you leave off the 'www', but that varies from search engine to search engine:
site:domain.com - gives all pages for the site in the index
link:domain.com - buggy, doesn't really work anymore.
domain.com -site:domain.com - returns all occurrences of domain.com not in domain.com [the - means 'not', + means 'only'].
-site:domain.com syntax works for Yahoo, Google and MSN too with their linkdomain: and link: commands.
allinurl:domain.com - returns all occurrences of domain.com, including those from domain.com itself. I haven't found a way to filter out those results yet.
allinanchor:any text - searches for occurrences of any anchor text you specify, including domain.com
allintitle:any text - returns pages where all the given search terms occur in the <title> tag of the HTML page.
allintext:any text - returns pages where the given search terms occur in the text of the document.
related:domain.com - shows what sites Google considers to be in your topic area. This may or may not be a slight peek into some of their newer algo tweaks.
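As a rough illustration of the commands above, here is a minimal Python sketch that assembles the query strings for one domain. The function names (strip_www, build_queries) and the dictionary labels are made up for this example. It only builds the text you would type into a search box and does not contact any search engine.

```python
def strip_www(domain: str) -> str:
    """Drop a leading 'www.', since link-type commands may work better without it."""
    return domain[4:] if domain.lower().startswith("www.") else domain


def build_queries(domain: str) -> dict[str, str]:
    """Return a few of the query strings listed above for one domain."""
    bare = strip_www(domain)
    return {
        "all indexed pages": f"site:{bare}",
        # The leading '-' excludes results from the domain itself.
        "mentions elsewhere": f"{bare} -site:{bare}",
        "domain in url": f"allinurl:{bare}",
        "related sites": f"related:{bare}",
    }


for label, query in build_queries("www.domain.com").items():
    print(f"{label}: {query}")
```

Swap in your own domain for www.domain.com; whether the 'www' should really be dropped depends on the search engine, as noted above.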
linkdomain:domain.com -site:domain.com - works fairly well, no big problems. Shows all backlinks to site in directory. Leave the 'www' part out. -site:domain.com excludes your domain. linkdomain:domain.com gives all links to all pages in domain.com.
linksite:domain.com also gives a good rundown of the sites linking to you; I'm unsure what the difference between linkdomain and linksite is. However:
linksite:domain.com -site:domain.com gives pretty impressive results, including pages that have not been linked to for almost 1 year. If you add a slash after 'domain.com' you may get more results. Also adding a 'www.' before domain.com will change the results.
link:http://domain.com gives links to homepage of site. Obviously the command:
will give different results, since some links will point to www.domain.com, and others just domain.com
link:http://www.microsoft.com gives 3,510,000 results
link:http://microsoft.com gives only 47,200 results. That's a big difference.
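To put a number on that difference, here is a small Python sketch using only the two counts quoted above. The helper names (parse_count, link_variants) are invented for this example, and the counts are the ones reported in this post, not fresh results.

```python
def parse_count(text: str) -> int:
    """Turn a result count like '3,510,000' into an integer."""
    return int(text.replace(",", ""))


def link_variants(domain: str) -> list[str]:
    """Build the link: query for the bare domain and for the www form."""
    return [f"link:http://{domain}", f"link:http://www.{domain}"]


www_links = parse_count("3,510,000")   # link:http://www.microsoft.com
bare_links = parse_count("47,200")     # link:http://microsoft.com
ratio = www_links / bare_links

print(link_variants("microsoft.com"))
print(f"The www form reports roughly {ratio:.0f} times as many links.")
```

Running both variants for your own domain and comparing the counts the same way should show which form most of your backlinks point to.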
site:domain.com - returns all indexed pages from domain.com. Quality can be exceptionally poor. For a site with about 350 pages, it returned only 19.
inurl:keywords - works like Google's allinurl:, all occurrences of keywords on pages indexed.
intitle:keywords - like allintitle:keywords with Google. All occurrences of keywords in the title tag of pages indexed.
url:http://domain.com - returns just that domain, a single return. A pretty useless command as they go. site:domain.com is what you would usually want.
MSN - the public beta is going live November 11, 04 according to this article
site:domain.com - all indexed pages. The 'www' will give different results if used.
link:domain.com. 'www' will slightly change results.
MSN operators in development but not fully active:
FileType: restricts documents to a particular filetype.
InAnchor: Like Google's allinanchor
InURL: Like Google's allinurl
InTitle: Like Google's allintitle
InBody: Like Google's allintext
LinkDomain:<domain> Like Yahoo's linkdomain, finds documents that point to any page in a domain.
Contains: returns documents that contain hyperlinks to documents with a particular file extension; for example, contains:mp3 returns documents that contain a link to an mp3 file.
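The "like Google's ..." pairings in that list can be written down as a simple lookup table. This is only a sketch: the function name msn_to_google is hypothetical, and "like" does not promise that the two operators behave identically, so treat the translated query as a starting point.

```python
# Pairings taken from the list above; MSN operator names are lowercased for lookup.
MSN_TO_GOOGLE = {
    "inanchor": "allinanchor",
    "inurl": "allinurl",
    "intitle": "allintitle",
    "inbody": "allintext",
}


def msn_to_google(query: str) -> str:
    """Rewrite 'Operator:terms' using the rough Google counterpart, if one is listed."""
    operator, sep, terms = query.partition(":")
    google_op = MSN_TO_GOOGLE.get(operator.strip().lower())
    if not sep or google_op is None:
        return query  # no listed counterpart, so leave the query alone
    return f"{google_op}:{terms.strip()}"


print(msn_to_google("InTitle:search engine tools"))
print(msn_to_google("FileType:pdf"))
```

Operators with no counterpart in the list, such as FileType: and Contains:, are passed through unchanged.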
Other Search Engine Link commands
Comments on Consistency of Return Values
None of these commands returns similar results [all MSN results from tech preview, 11-9-04]:
Yahoo: 4,930,000 pages
link:microsoft.com or linkdomain:microsoft.com
Google: 150,000 [showing the absurdity of that command even existing in Google anymore]
replacing 'link:' with 'allinurl:' gives: 9,120,000 from Google.
linkdomain:microsoft.com -site:microsoft.com with Yahoo returns 24,100,000
link:microsoft.com -site:microsoft.com with MSN beta returns 3,140,092
allinurl:microsoft.com -site:microsoft.com with Google returns 223,000
allinurl:www.microsoft.com -site:www.microsoft.com returns 48,700 in Google.
Obviously Google needs some work here I'd say.
As you can see, there's a very wide range between the top 3 search companies in how they handle their data. These numbers also swing widely from day to day; Yahoo's results for microsoft.com, for example, move by several million.
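For a quick sense of how wide that range is, this Python sketch compares the three "-site:" counts quoted above. The dictionary labels are just for display, and the figures are the ones from this post (11-9-04), so they will not match what the engines return today.

```python
# Counts reported above for the microsoft.com backlink-style queries.
counts = {
    "Yahoo (linkdomain:)": 24_100_000,
    "MSN beta (link:)": 3_140_092,
    "Google (allinurl:)": 223_000,
}

highest = max(counts, key=counts.get)
lowest = min(counts, key=counts.get)
spread = counts[highest] / counts[lowest]

print(f"{highest} reports about {spread:.0f} times as many results as {lowest}.")
```

Keep in mind the three engines are not running the same command here, so this compares what each one offers rather than like for like.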
Comments on how a Site Currently is indexed
One thing that is instantly noticeable with all the main search engines (Yahoo, MSN beta, and Google) is that the way they construct their picture of your site is based not on the site's pages but on the links to those pages. For example, MSN beta returns roughly 4100 pages for techpatterns.com, mostly because of the individual topic links on these forums. Yahoo returns 19 [the site was 301'ed 3 months ago, and for the old domain Yahoo returns 95; returns for the current domain are random as far as I can tell, some old pages, some new, so in my opinion this algorithm cannot currently be taken very seriously]. Google gets closest with about 786 pages, roughly twice as many as the site has.
However, the individual topic link pages are blocked in robots.txt, as are the profile and member pages. So you can easily see that what a search engine uses to determine how many pages a domain has is not the physical file, the HTML, but rather the links to that file.
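To see how a link-based count and a crawlable-page count can drift apart, here is a sketch using Python's standard urllib.robotparser. The robots.txt rules, the example.com URLs, and the script names (viewtopic.php, profile.php, memberlist.php) are all assumptions for illustration; they are not this site's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking topic, profile and member pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewtopic.php
Disallow: /profile.php
Disallow: /memberlist.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# URLs an engine might discover purely by following links (made up for this example).
linked_urls = [
    "http://example.com/index.php",
    "http://example.com/viewtopic.php?t=12",
    "http://example.com/profile.php?u=3",
]

crawlable = [url for url in linked_urls if parser.can_fetch("*", url)]
blocked = [url for url in linked_urls if not parser.can_fetch("*", url)]

print(f"{len(linked_urls)} linked URLs, {len(crawlable)} crawlable, {len(blocked)} blocked")
```

An engine that counts every URL it has seen a link to would report all three here, even though it is only allowed to fetch one of them.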