The business of Search covers a wide area: web search, mobile search, video search and so on. You'll find all of that, along with the techniques and intricacies associated with it, here.

Dashes Or Underscores? Does Google Care?

Thread Title: Domain Name Dilemma: Do Dashes or Underscores Goose Google Rankings More? Thread Url: http://forums.digitalpoint.com/showthread.php?t=1811 Thread Description:

John Gergye over at Digital Point forums outlines a test he has used to conclude that the choice between using dashes or underscores in your page/domain/directory titles does affect your Google ranking.

Yahoo! Speaks out on Site Match Conspiracy Theories

Thread Title: Sitematch: Organic Rankings Suicide? Thread Url: http://www.threadwatch.org/node/610 Thread Description:

Tim Mayer clears up some issues concerning the Site Match PFI system. Many webmasters are confused by Site Match, and theories abound on the SEO forum circuit that your organic rankings drop once your budget runs out or you stop using the service. He had this to say in the post threadlinked above:

We made a slight change a while back as we have been really focused on comprehensiveness at Yahoo. If we dont know about your site and you submit it to Site Match you will be suggested for inclusion in the main crawl and go through all the usual quality determinations to figure out if and to what representation you will appear in the main crawl index. The conspiracy theories that we will knock the rest of your site out of the index or that we will delete it if you dont pay are just not true and if anyone experiences this they should send me a message via forum mail documenting this. The issue is usually a content issue and people usually try to blame Site Match for their problems.

Become.com

Thread Title: Upcoming Become.com Crawler Out on the Prowl Thread Url: http://forums.searchenginewatch.com/showthread.php?t=3105 Thread Description:

Update: Sorry, I messed up. If you're looking for the public beta post it's right here - shit happens when you're tired heh...

Michael Yang's new shopping search engine Become.com has been spotted out in the bush by Marcia at SEW.

Michael blogged about the launch and mentioned that the beta version is open only to friends and family of Become's employees at present.

Yang co-founded mySimon back in 1998 with partner Yeogirl Yun (become.com's chairman and CTO) who also founded WiseNut. They are currently enjoying some heavy VC backing for their new venture.

I also found this blog post by Hubert Chen who appears to be a developer for Become.

Now, with the team that brought us mySimon and WiseNut you'd expect this one to be interesting at the very least...

Is Yahoo! WebRank Dead?

Thread Title: Yahoo WebRank Thread Url: http://forums.digitalpoint.com/showthread.php?t=1856 Thread Description:

Some speculation over at the DP forums on whether the beta Yahoo! WebRank has been abandoned. The thread dates back to July but has just been reopened. I can't say I've heard or bothered to find out much about WR but the thinking in the thread is interesting.

Is it just a gimmick? Is it worth anything to search marketers?

Theories ranging from "it was just crap" to "sailing too close to the Stanford PageRank patent" are aired in response to earlier failings in WR, but the debate continues.

Does anyone have the scoop on WR?

Buying Text Links - Who to Trust?

Thread Title: Who are the best Text Link Brokers Thread Url: http://www.seozip.com/forums/thread221.html Thread Description:

Unless you've had your head in a bucket for the last couple of years you will have noticed that one of the most aggressively promoted and fiercely competed over sectors in the webmaster market is text link sales.

I think Threadwatch has had every major brokerage approach them regarding advertising and you can't miss the banners and promos all over the webmaster forum and blog scene. But who do you trust?

As in many aggressive markets there is a certain amount of distrust over some if not all of the major text link houses. It doesn't mean they're all crooks, but perception is often derived from the way a firm markets itself, and in a fiercely competitive sector like this that marketing tends to the dark side :)

Seozip have a small thread about who the best firms are, and what makes it worth a visit is the fact that seobook, nandini and Anthoney Parsons all know their stuff and know many of the names behind the brokerages personally.

If you're looking for recommendations, it's a must read.

Search Engine for Hand Written Documents - History Meets Search Tech

Thread Title: Researchers create tool to automatically search handwritten historical documents Thread Url: http://www.umass.edu/umhome/news/articles/7683.php Thread Description:

The Center for Intelligent Information Retrieval at the University of Massachusetts Amherst has created a manuscript retrieval system capable of scanning and understanding handwritten documents.

Imagine the potential of that...

On scanning/searching George Washington's personal diaries:

The scanned pages of Washington’s papers can be searched by typing in a word such as “Washington” or “Virginia,” and the program produces a list of ranked pages showing where they appear.

Manmatha says, “Right now, searching a scanned handwritten document is very hard to do. Scanned historical documents are basically images, or pictures, and currently can only be searched if someone manually transcribes the documents or creates an index of their contents. This is time consuming and expensive to do. Given the cost, most handwritten documents are never transcribed or indexed,” Manmatha says. “But there is an enormous amount of handwritten, historical material.”

According to Toni Rath, “The basic idea is analogous to searching text documents in one language, say French, using queries in another language, say English. This is usually done by learning models from documents written in both languages. By analogy, our system learns from a parallel body of transcribed scanned images. That is, the word images form a ‘visual language’ and the transcriptions are in English.” Once the model is learned it may be used for searching scanned pages for which no transcriptions are available.

story via slashdot

An Economist's View on Click Fraud - Reyes talking Bollocks?

Thread Title: An Economists View on Click Fraud Thread Url: http://weblogs.jupiterresearch.com/analysts/scevak/archives/005264.html Thread Description:

Jupiter analyst Niki Scevak gives an economist's view on click fraud in the post threadlinked above.

In light of what Google CFO George Reyes said about click fraud threatening the G biz model, Niki's thoughts on the subject make for a good read:

Firstly, click fraud is a bad thing that should be policed and eliminated by the engines and they have no excuse now that they have $50bn market valuations to hire scores of click fraud cops to eliminate it. But it will have zero impact on Google's revenue, or any other search company, and zero impact on the growth of that revenue.

Here's why. Click fraud is already priced into the cost per click. Marketers bid based upon how well the leads that Google and others send them convert into, in most cases, direct sales. That means that if one person out of every hundred buy, and they make $100 per sale then they will spend up to $1 per click. Now out of that 100 clicks, the fact that 50 (gross exageration used for effect!) of them are click fraud is irrelevant. If Google eliminates click fraud then that means that one person out of fifty will now buy, and so the marketer will be willing to pay up to $2 per click now.

The volume will decrease but the cost per click will rise to balance this.

Emphasis mine.
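The arithmetic in the quote can be checked with a few lines of Python. This is only a sketch of the break-even reasoning as quoted, using the quote's own illustrative figures ($100 per sale, 1 sale per 100 clicks, 50 of those clicks fraudulent); the function name `max_bid_per_click` is made up for the example and is not anything Jupiter or Google publish.

```python
def max_bid_per_click(revenue_per_sale, sales, clicks):
    """Break-even bid: the revenue a marketer expects from a single click."""
    return revenue_per_sale * sales / clicks

# Figures taken from the quote: 1 sale per 100 clicks, $100 per sale.
bid_with_fraud = max_bid_per_click(100.0, 1, 100)

# Remove the 50 fraudulent clicks: the same 1 sale now comes from 50 clicks.
bid_without_fraud = max_bid_per_click(100.0, 1, 50)

# Total spend is the bid multiplied by the number of clicks paid for.
spend_with_fraud = bid_with_fraud * 100
spend_without_fraud = bid_without_fraud * 50

print(bid_with_fraud, bid_without_fraud)
print(spend_with_fraud, spend_without_fraud)
```

Under these assumptions the bid doubles while the click volume halves, so the marketer's total spend is the same either way, which is the point the quote is making.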

He goes on to say that Reyes would be better off doing his accounting than spouting off about click fraud (paraphrased heh..).

So, is George Reyes just spouting off about stuff he doesn't understand? Probably not eh? If that's the case, why is he making these statements?

Goodbye Search Engine - Hello Sense Engine

Thread Title: Searching Doesn't Make 'Sense' Thread Url: http://turk.internet.com/haber/yazigoster.php3?yaziid=11472 Thread Description:

The article threadlinked above is essentially, as Techdirt point out, talking about clustering - like Clusty the clown, the baggy-trousered pie thrower of search.

Crystal Semantics has developed the 'Sense Engine' in order to produce relevant search results by utilising the senses of words, rather than statistical algorithms used by other search technology. Because any word in the English language can be part of a search enquiry, each word is analysed to determine its potential to discriminate which context the search should cover. The 'Sense Engine' identifies all the likely search words, advises the user of the different contexts the search should cover, and categorises the results encyclopedically providing users with results relevant to their request.

The 'Sense Engine' is the result of a six-year search linguistics development programme undertaken by Professor David Crystal, a world authority on linguistics, encyclopedia editor and published author for Cambridge University Press and Penguin Books. £4 million has been invested in lexicographical and encyclopedic research, giving the 'Sense Engine' a classification system of around 2,000 categories derived from an encyclopedia component of over five million words.

This all begs the question, will clustering take hold...?

By the way, the internet.com article above disables right click via JavaScript. We have a word for people that behave like that in England: it starts with W and ends in ANKERS.

Overture, Geico Settle Trademark Dispute

Thread Title: Overture, Geico settle trademark dispute Thread Url: http://news.zdnet.com/2100-9588_22-5473231.html Thread Description:

Overture Services has settled a lawsuit brought by insurance giant Geico, ending a battle in an ongoing war over the commercial use of trademarked terms in Web search results.

The Geico vs Google suit is ongoing - the judge has denied Google's motion to dismiss it.

Interesting times for the PPC market.

Sitematch: Organic Rankings Suicide?

Thread Title: Msn, Yahoo/inktomi/overture Trusted Feed, And what happens to Organic Crawl data Thread Url: http://www.highrankings.com/forum/index.php?showtopic=10332&hl= Thread Description:

This is an interesting thread, as it shows that even in the minds of some of the more experienced practitioners such as Jill Whalen and ProjectPHP there still exists a degree of uncertainty and cloudiness when it comes to this PFI program. The main question is whether or not you reappear once your budget has expired, based upon your original 'natural' crawl position. Lots of 'possibly's and 'should's from David at Trellian, along with a few helpful suggestions.

Sitematch was launched back in May sometime. At the time I read various threads at WMW from confused webmasters grappling to get to grips with whether it was a good or a bad thing.

Questions like:

If you submitted to Sitematch, what would your position be once your budget was exhausted?

Would Sitematch be the kiss of death for an affiliate content website?

What about a site that had an INK penalty: would it be considered under this scheme, and would it be included whilst its budget was active?

This WMW thread threw up all sorts of issues.

I haven't really looked at Sitematch for a while, and I don't know if it's changed, improved or even gotten worse. At this moment in time, natural crawls (for me at least) seem to cut the mustard. I don't see a need or requirement for it and I don't entirely trust it either. Can anyone point to a definitive position? Is Sitematch dead in the water, or has it undergone some mysterious, not very well publicised rebirth?

MSN Beta Moving to MSN Main Site?

Thread Title: MSN Beta Slowly Propagating to Non Beta MSN Search Thread Url: http://www.webmasterworld.com/forum97/266.htm Thread Description:

Barry Schwartz is reporting on the WMW thread threadlinked above, where members appear to be seeing the MSN beta search moving to the main search.

ADDED: Looks like the real deal from where I sit...

What do the Threadwatchers think?

Has the Sandbox been Cracked?

Thread Title: Google Sandbox: solved? Thread Url: http://www.platinax.co.uk/news/archives/2004/12/google_sandbox.html Thread Description:

Brian over at Platinax thinks he may have solved the riddle of sandboxing - I'm not convinced, but then to be fair, neither is he :)

It's a long, detailed post that essentially boils down to this: High PR links from a wide range of IP addresses will avoid sandbox.

Ultimately, the issue becomes one of considered linking, and attempting to get links on pages according to PageRank value, rather than sheer numbers - or even topic - first.

This, of course, is precisely what Google wishes to frustrate - so until we can reliably gauge PageRank values of specified pages, then it's going to have to be for webmasters to gauge the value of pages for linking purposes based on a combined judgement of old PageRank data - along with some common sense and creative thinking.

Is the Google Sandbox solved by this hypothesis? I'm not convined it would be wise to claim so - but I do suggest the idea to the wider SEO community as some way to explaining what is actually going on, in a way that makes sense to all SEOs when we use the term "Google Sandbox" and "sandboxing".

As I'm not a techie SEO I'll rely on the good boys and girls at Threadwatch to voice their opinions on this one. Go have a read, then please tell me what you think.

Google grants Amnesty for Spammers to Help Fix Glitch

Thread Title: Let's Test Hijacking A Google Listing Thread Url: http://forums.searchenginewatch.com/showpost.php?p=25241&postcount=51 Thread Description:

In a surprise move on the SEW forums, Google's unofficial representative and Threadwatch member GoogleGuy, a well known figure on some of the bigger search marketing forums, has granted amnesty to spammers in order to get help.

Google have called for examples of the now infamous Google Results Hijacking scandal that has been buzzing through the Search community this week and last. In a thread where members have tried to get help with this problem the unnamed Google search engineer said:

I'll promise that no spam-related action will be taken based on the reports. If months later, the domain comes up for review for an unrelated reason, then that's a different matter, but I'll instruct whoever collects the feedback to only use it to check out how we pick canonical pages.

The amnesty was granted because, when GG called for examples of the hijack problem, none were forthcoming. The technique is known to but a few and is being used almost exclusively in highly competitive categories such as pharma and casino. Nobody in that industry plays by Google's guidelines, as to do so would be a waste of time in such a cut-throat environment.

A page hijacking results in the victim site's position in Google being taken over by a competing site through the use of 302 redirects and meta refreshes.
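For readers who have not met the two mechanisms named above, here is a rough Python sketch of how one might spot them in a fetched response. It is only an illustration: the function `redirect_target`, the class `MetaRefreshFinder` and the example URLs are invented for this post, and nothing here describes how Google picks canonical pages or how the hijackers' pages are actually built.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Records the URL of the first <meta http-equiv="refresh"> tag seen."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/".
        _, _, rest = attr.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = value.strip().strip("'\"")


def redirect_target(status, headers, body):
    """Return (kind, url) if the response redirects elsewhere, else None."""
    if status == 302:
        for name, value in headers.items():
            if name.lower() == "location":
                return ("302", value)
    finder = MetaRefreshFinder()
    finder.feed(body)
    if finder.target:
        return ("meta-refresh", finder.target)
    return None


# Hypothetical responses from a page pointing at a victim site.
print(redirect_target(302, {"Location": "http://victim.example/"}, ""))
page = '<html><head><meta http-equiv="refresh" content="0; url=http://victim.example/page"></head></html>'
print(redirect_target(200, {}, page))
```

A page that answers with either of these while pointing at someone else's URL is the kind of example GoogleGuy was asking for.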

We will update you as the story unfolds.

Google Pagerank for entertainment purposes only?

Thread Title: Google Pagerank for entertainment purposes only?!?! Thread Url: http://forums.searchenginewatch.com/showthread.php?t=3054 Thread Description:

I know Google PR has been pretty wonky, now more than ever, but for entertainment purposes only? This comes from a thread over at SEW:

"The PageRank that is displayed in the Google Toolbar is for entertainment purposes only. Due to repeated attempts by hackers to access this data, Google updates the PageRank data very infrequently because is it not secure. On average, the PR that is displayed in the Google Toolbar is several months old. If the toolbar is showing a PR of zero, this is because the user is visiting a new URL that hasn't been updated in the last update. The PR that is displayed by the Google Toolbar is not the same PR that is used to rank the webpage results so there is no need to be concerned if your PR is displayed as zero. If a site is showing up in the search results, it doesn't not have a real PR of zero, the Toolbar is just out of date"

Are Search Engines Pushing Webmasters to Produce Poor Sites?

Thread Title: I want to give them content but they wont have it. Thread Url: http://forums.searchenginewatch.com/showthread.php?t=3032 Thread Description:

Threadwatch member ukgimp raises some interesting debate over at SEW:

Here is my reasoning. I have spent good amounts of cash on decent content and pretty pictures but Google dont want it!

What do they want, well to me is seems like useless API scraped cack mixed in with paid results. Plus other argy bargey that a handfull of serious people know about

Sandbox discussion aside there are people out there that can create monsters than rank for lots. They may only last a month, but they do OK out of it the unleash another beast.

The thread centers around the fact that it seems, at least to some, that to get good listings you have to break "the rules" and worse, produce sub-standard websites.

I think it makes sense for Google and other SEs to slowly push new sites into a corner - just look at the sandbox fiasco. Some webmasters forget that the business of a search engine is to make money. They do that by selling AdWords and premium listings, so are we fighting a losing battle by producing clean, easy to use, information rich websites or what?

Froogle Germany

Thread Title: Froogle Germany Launches Thread Url: http://froogle.google.de/ Thread Description:

Froogle Deutschland has just been launched - Froogle domination seems inevitable and imminent. via Gary

More Approved Cloaking at Google

Thread Title: Working With Google Scholar -- And More Approved Cloaking Thread Url: http://blog.searchenginewatch.com/blog/041201-063855 Thread Description:

Danny Sullivan has a great piece on a pet topic of his, "Google and approved cloaking" - it's a pet topic of mine too :)

This was lifted from this Google Scholar piece over at the new G Scholar blog.

The second issue was to ensure that the crawler got the full text so they could work their on the full content rather than just the titles and abstracts. A bit of sleight-of-hand at our end ensured that the crawler got what it needed but with the URLs in the Google index being a suitable entry point for an end user.

Sleight of hand indeed...

Danny goes into all the detail you need, get on over there and check it out...

CiteULike - White Paper Cite Search & RSS

Thread Title: CiteULike Tracks Favorite Citations Thread Url: http://www.researchbuzz.org/archives/002185.shtml Thread Description:

ResearchBuzz has a piece on CiteULike, an application that allows you to track white papers and who cites them by tag, author, user and more.

Once you've set up a watchlist you can grab the RSS at the bottom of the page and add it to bloglines or your favorite desktop aggregator.

It looks superb, but only time will tell as I track my list via RSS. Search geeks will want to check out the favorite tags menu on the right and hit the search, algos etc links to build a good list.

Nice catch Tara!

Webmaster Radio - Yahoo Hand Manipulation

Thread Title: Webmaster Radio - Yahoo Hand Manipulation Thread Url: http://www.webmasterradio.fm Thread Description:

Webmasterradio.fm are talking about Yahoo! hand manipulating results:

You can't prove it scientifically, but you can say that hey, I'm an SEO, I've been doing this for years and I can tell you that that SERP should not be showing up all .orgs!

Talking about the h1, h2 and h0's on certain SERPS in particular areas like spyware removal - A VERY clean result with h=1 parameters in the urls near the top and moving down to h=2 and h=0's

The debate continues on the sense of this hand tweaking and how it's quite understandable in a SERP like the above and for example medications etc - traditionally the haunt of hard core scum affiliates.

Good stuff. It's the first time I've tuned in and I'm enjoying it immensely - shit, I don't even know who's speaking, but one of them sounds remarkably like Todd Friesen (oilman) and I'm reasonably certain that when I stop waffling on and go check out the playlist I'll find him billed tonight.

If you're not listening, tune in now: http://www.webmasterradio.fm

NuSearch - A Search Engine that Improves with Use?

Thread Title: http://www.searchenginejournal.com/index.php?p=1103 Thread Url: http://www.searchenginejournal.com/index.php?p=1103 Thread Description:

Loren over at sejournal has the scoop on the latest in a long line of new search engines that are only used by search geeks and have very little chance of a real future. NuSearch does have some very interesting ideas, but then don't they all?

What else marks out NuSearch as revolutionary? According to Giles Chanot, Chief Software Architect at NuSearch “Well, that Applet that sits in the web page is really the key to the whole show. Every time you perform a search and it downloads some web pages, it compares these with the copies on the NuSearch server. If they’re new or have been updated, this information is sent back to the server (in a highly compressed form). This enables NuSearch to keep its index much more up to date than would otherwise be possible: the more people use NuSearch, the better it gets.”
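To make the idea in the quote a little more concrete, here is a toy Python sketch of the compare-and-report step. It is purely an assumption about how such a check could work: the quote does not say how NuSearch compares pages, so the SHA-256 digests and the names `page_digest` and `pages_to_report` are invented for illustration.

```python
import hashlib


def page_digest(content: bytes) -> str:
    """Fingerprint a page so copies can be compared without sending the full text."""
    return hashlib.sha256(content).hexdigest()


def pages_to_report(fetched, server_digests):
    """Return URLs whose fetched content is new or differs from the server's copy."""
    changed = []
    for url, content in fetched.items():
        if server_digests.get(url) != page_digest(content):
            changed.append(url)
    return changed


# Hypothetical state: the server knows two pages, the client has fetched three.
server_digests = {
    "http://a.example/": page_digest(b"old text"),
    "http://c.example/": page_digest(b"same text"),
}
fetched = {
    "http://a.example/": b"new text",   # updated since the server's copy
    "http://b.example/": b"hello",      # not known to the server at all
    "http://c.example/": b"same text",  # unchanged, nothing to send
}
print(pages_to_report(fetched, server_digests))
```

Only the new and updated pages would be sent back, which is how more users could mean a fresher index.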

On the basis that it requires a Java applet and I'm terminally paranoid, I'll give it a swerve, but you might like to go play with it. If you do, please tell us what you think...