Auto Generating Content - A Contentious Issue at Best

5 comments
Thread Title:
Beta Testers wanted for Text Content Development Tool
Thread Description:

There are a few tools floating around the web that will generate seemingly natural language text for you. Often such tools are found on academic sites in the context of language research, and others are passed around under the table by SEOs of a darker nature.

They are typically used to generate KW rich pages by the thousand and (if the creator of such a site is smart) cloaked against prying competitors' eyes. I've seen this used for AdSense in particular and for redirecting human SE referrals directly to affiliate merchants.

Probably not doing the web as a whole any great harm but certainly not doing it any great good either. My views on such practices tend to run along the lines of "if it will make me money without hurting children or small fluffy animals, deal me in.."

The thread linked above is about such a tool and is requesting beta testers - some of you may like to check it out, some of you may like to air your opinions on such practices.

If so, fire away....

Comments

You should have seen my frown...

...when I read that thread earlier today. "Acceptable grammar", "text content of varying lengths", this thing may be beneficial to your ranking, but keep it out of my serps - please. It's so funny when people write about content as a means to fill pages with keywords to rank high. I've always been of the "naive" type, thinking that internet documents should present information of a kind. Hmm - I guess I'm too old-school to ever achieve something on the web hehe.

Automatic con-content is efficient, clever maybe, but pages/sites like that (and those related) instantly go on my anti-favourites list. ;)

Clarification

Thanks Nick for starting this thread (I think?).

I'm the person looking for beta testers and I can see from some of the comments that some clarification is in order.

I author and produce websites by 'handcrafting' my pages and enjoy good (legitimate) prominence on the main SEs. I also believe that the most relevant pages are ones which are constructed from content consistent with any 'claims' the page may have vis-à-vis titles, headings, photos, etc.

In the past year, I have coded over 12,000 pages and the tool I have produced evolved from the ever-increasing 'writer's block' which seemed to start creeping in after the first 5,000 pages or so.

If one refuses to use a thesaurus when authoring content on the grounds that it results in unoriginal or uninformative text, they will definitely not value the merits of this tool.

If, however, one does rely on a dictionary, thesaurus, and/or some research for common and technical phrases relating to the content being authored, then this tool may be of value to them.

Just as one cannot produce a coherent body of text (with the exception of a ransom note) by simply snipping pieces from a thesaurus, my content building tool is incomplete without the author's original contributions and direction.

After putting together many websites for my clients and myself involving related industries, there becomes an increasing risk of reusing the same old phrases. This is not good from an SE standpoint or from the client's expectations of an original result standpoint. This is how my tool unfolded.

I wanted a handy hierarchical 'phrase builder' which would come up with relevant 'grammatically acceptable' snippets of prose of 'varying lengths'.

The 'auto' part of the content builder is just the starting point - you can use the text as is, modify it, paraphrase it, or 'roll the dice again' if the phrase does not fit the style of what you are working on. The author is responsible for the writing and 'editing' of the finished work. My content builder does not generate pages, nor does it produce blind content 'on the fly'.
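For readers wondering what a hierarchical 'phrase builder' could look like in practice, here is a minimal Python sketch. It is not the tool's actual code, which is not shown in this thread. The PHRASES dictionary, the expand and suggest functions, and the sample vocabulary are all hypothetical, chosen only to illustrate template expansion with a 'roll the dice again' step.

```python
import random

# Hypothetical phrase hierarchy: each key maps to alternative templates.
# A {placeholder} inside a template refers to another key in the hierarchy.
PHRASES = {
    "sentence": [
        "{subject} {verb} {object}.",
        "{subject} {verb} {object} {qualifier}.",
    ],
    "subject": ["Our workshop", "The studio"],
    "verb": ["builds", "restores"],
    "object": ["oak furniture", "antique cabinets"],
    "qualifier": ["by hand", "to order"],
}


def expand(key, rng, phrases=PHRASES):
    """Pick one template for `key` and recursively fill its placeholders."""
    out = rng.choice(phrases[key])
    while "{" in out:
        start = out.index("{")
        end = out.index("}", start)
        inner = expand(out[start + 1:end], rng, phrases)
        out = out[:start] + inner + out[end + 1:]
    return out


def suggest(rng, sentences=1):
    """Return a snippet of the requested length (one or more sentences)."""
    return " ".join(expand("sentence", rng, PHRASES) for _ in range(sentences))


if __name__ == "__main__":
    rng = random.Random()
    print(suggest(rng))               # a single short sentence
    print(suggest(rng, sentences=3))  # 'roll the dice again' for a longer snippet
```

Each call to suggest is one roll of the dice. The output is only raw material, and the author still decides whether to keep, edit, or discard it.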

The best way to describe the tool in a nutshell is as a 'thesaurus on steroids'.

Rather than claiming grammatical perfection, I wanted to be reasonable in characterising the quality of the result. The dictionaries and algorithms I have developed are pretty good and the text produced is acceptable for human consumption.

By 'varying lengths', I only wanted to point out the tool can be used to produce simple sentences/questions or longer paragraphs - nothing subversive or covert was implied.

In any case, anyone qualified and interested in exploring this concept is welcome to let me know.

Kindest regards,
-Dino

Thanks for this explanation - Dino...

...and I for one DO recognise the cleverness AND usefulness of a program like this. It's just that I'm sure lots of SEOers use automated tools of whatever quality to build page-upon-page of heavy spam to improve rankings.

As your tool might be of good use to SEO-savvy article writers etc., I fear it will be "misused" - be it your intent or not - by keyword spammers and the like. They do not hurt me, but I can't say I like them... Hence my frown which has still to disappear completely ;-) Kudos for your standpoint though, AND for your programming skills if this thing does what you claim it does.

P.S.: maybe I should sign up - just because...

P.P.S.: sorry that this particular thread has changed somewhat... either it was edited, or people deleted their posts. Maybe you should elaborate there too... Cheers.

Welcome to Threadwatch dcorte

Welcome to Threadwatch dcortez ;-)

I have a question: How come the tool must contact a server? - why not make it downloadable...

I'm always a little wary of using any tool where I'm required to input information that would spell out exactly what area I'm working in. I'm sure you're a trustworthy bloke heh.. but I only have your word that the data collected from such queries would not be used in some other way right?

Food for thought when finalizing the product I hope..

Thanks for the feedback

Nick, thanks very much for your welcome.

Wit, your comments are well taken.

I agree, as with any tool (text editor, html developer, graphics program, email program) the final result produced by the tool can be appreciated or abhorred depending on how the tool was used.

Because the legitimate uses of my tool at least equal the possible abusive uses (just like many of the tools we already use), I'm going to let the users set their own paths.

However, this does raise a good issue. I'm not interested in contributing to blatant keyword spam and, to that end, I have implemented some mechanisms to help deter some bad behaviour.

Specifically, I designed my tool to be used by a human. I have taken precautions to limit some of the more obvious forms of abuse by implementing the solution as client/server (to partly answer your question Nick).

Only one instance of the program can be run at a time, and bombarding my servers to datamine/scrape my databases using other applications is deterred by the way in which data is accessed through the client.
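To make the idea concrete, here is a rough Python sketch of how a server might enforce 'one instance at a time' and slow down scraping. The thread does not describe the actual mechanism, so everything here is an assumption made for illustration: the AccessGuard class, its method names, the per-user session token, and the fixed minimum interval between requests.

```python
import time


class AccessGuard:
    """Hypothetical server-side guard: one session per user, throttled requests."""

    def __init__(self, min_interval=2.0, clock=time.monotonic):
        self.min_interval = min_interval  # assumed seconds between accepted requests
        self.clock = clock                # injectable so the guard can be tested
        self.sessions = {}                # user_id -> token of the running client
        self.last_request = {}            # user_id -> time of last accepted request

    def open_session(self, user_id, token):
        """Refuse a second concurrent client instance for the same user."""
        if user_id in self.sessions:
            return False
        self.sessions[user_id] = token
        return True

    def close_session(self, user_id, token):
        """End the session only if the caller holds the active token."""
        if self.sessions.get(user_id) == token:
            del self.sessions[user_id]

    def allow_request(self, user_id, token):
        """Accept a request only from the active session, at a human pace."""
        if self.sessions.get(user_id) != token:
            return False
        now = self.clock()
        last = self.last_request.get(user_id)
        if last is not None and now - last < self.min_interval:
            return False
        self.last_request[user_id] = now
        return True
```

A person clicking through phrases would rarely notice a limit like this, while a script hammering the server would have most of its requests rejected.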

I have developed desktop apps, client/server apps, and server apps.

Before the Internet, stand-alone desktop applications were the only common solution available. Cumbersome updates and compatibility issues were nagging problems.

When the internet started unfolding, I developed client/server apps because IE 1.0 and NN were unable to provide data validation and locally active functions (those not requiring a server hit). I developed Internet Direct Chat (an instant messaging program long before ICQ and the current IM programs) and a client/server application for the government which allowed the logging industry to network at all levels to try and get more value added activity on our natural resources.

The custom client managed the front-end of the application as far as behaviour and GUI went. If only browsers could do what we still needed to code standalone clients for.

Then came the 'active' era and web browsers featured JavaScript, Java, ActiveX and more goodies to allow a web browser to be configured as a 'custom client'.

For intranet projects, the server-based solution using an active browser was a dream come true. I have designed and developed complete solutions for networking within government, and to migrate to paperless (but fully audited) forms systems.

As I started developing more 'private sector' solutions as server-based solutions, I quickly discovered to my terrible disappointment that the levels of abuse were unimaginable (on both sides of the fence - mainstream and especially adult oriented). The number of bright minds worldwide actively applying their skills to malicious endeavours never ceases to amaze or disappoint me.

My short-lived enthusiasm about 'active browser' clients evolved into a return to the hybrid client/server model, but with a very different balance.

The main problems of desktop apps, as we know, include:
- Compatibility issues
- Installation required
- Minor updates need cumbersome installs
- Large and/or dynamically updated data makes client too big
- Proprietary information is in too many hands
- Trust is required to install a program on one's system

By designing a client app which is very light (the industry equivalent of a customized terminal), I can keep (my) valuable information safe and manage the use of my solution to the way in which it was intended.

As much as we don't enjoy installing apps, it still is a flavour of accessing contemporary solutions for our needs. Many of the desktop solutions we use today (ICQ, MSN), could be implemented as browser/server solutions but they aren't for some of the reasons I have indicated.

I would love to offer my solution as a pure web solution, but I'm concerned about the abuse it would get from bots and unfriendly agents relentlessly pecking away at the front door.

At least with a lite client, I can make more of my server resources available to my legitimate user base.

I'm hoping to find some 'mainstream' web developers who recognise the needs my solution tries to address and to work with them to tune my tool/content for their particular niches/industries.

I welcome your comments.

Thanks,
-Dino
