Didn't Like Web Accelerator / Prefetch? - Read On


Well, caching reaches new heights with this one:

Laptop comes preloaded with abridged Web

Squeezing Web onto a laptop

From the IHT piece:

The company, Webaroo, plans to announce Monday that Acer, the Taiwan-based personal computer maker, will begin selling laptops furnished with 40 gigabytes of data, representing a snapshot of the Web.

While the full Internet is at least one million gigabytes, Webaroo argues that it has created a means to provide off-line Web searchers with a useful subset of the Internet's vast store of data and information.

If you're lucky enough to have them choose you (sic), then just relax:

"There is a lot of junk and a lot of redundancy on the Web," said Rakesh Mathur, the chief executive. Even so, the system would strive to deliver sites intact, including their ads.

Right. Offline, unclickable ads. Form an orderly queue.



Bah! Lawsuit waiting to happen.

If we presume for a moment

[devil's advocate]
If we presume for a moment that we exclude sites that are monetised via advertising, isn't this a good thing?

By that I mean any form can still be filled in offline and submitted the next time they go online, all phone numbers will still work offline, sales

[/devil's advocate]

Or am I wrong?


They would essentially be republishing copyrighted material, unless they only publish GNU material (which is not only unlikely but also a very skewed view of the net, and therefore defeats the purpose).

Isn't republishing

Isn't republishing copyrighted material what the main search engines do (via the cached page function), and hasn't that been shown to be legal (in the US at least) by recent rulings?

If we put aside legal issues for a moment (I know it's easier said than done, but bear with me :) ), isn't this system (PPC or CPM advertising aside) delivering another route to get your words / pages / sites seen by a previously "hard to tap" sector, and therefore delivering an opportunity at monetisation that would otherwise have been missed?

With WIFI spreading,

With WiFi spreading, internet phones etc., isn't this idea 10 years too late?

Any Internet Access on The

Any Internet Access on The Tube?

Another reason

To block bots and chant my slogan about stopping the entitlement mentality to anything placed on the net.


Hey Bill and anybody

Hey Bill and anybody else.

Can you think outside the box for a moment and think with me to see if there is any good that "may" come from this? I appreciate it could well be mostly bad, but let's think long and hard at all sides of the equation first.


I can see an upside. Obviously we wouldn't want PPC ads in these "Web Packs", but there would surely be value for my sites that have a different sort of action, or paid inclusion type review sites, etc. That's just for me as a publisher. Anyone who's taken their laptop out of the city could see the value added by a huge searchable cache.

The interesting sort of self-filtering that software like this provides is that sites that exist primarily to deliver advertisements are simply not going to exist in the database. If you are truly concerned about it, exclude the bot, or better yet, if your content is being indexed, take advantage of it with custom-delivered, situation-appropriate advertising measures.
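For anyone wondering what "exclude the bot" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent name `ExampleOfflineBot` is made up for illustration (I don't know what Webaroo's crawler actually calls itself), and this only helps if the crawler honours robots.txt at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler, allow everyone else.
# "ExampleOfflineBot" is an assumed name, not a real crawler's user agent.
ROBOTS_TXT = """\
User-agent: ExampleOfflineBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleOfflineBot/1.0", "SomeOtherAgent/2.0"):
        print(agent, is_allowed(agent, "http://example.com/page.html"))
```

This is the opt-out route: the rules are advisory, so a crawler that ignores robots.txt will not be stopped by them.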

Of course, the more Webaroo gives to webmasters, the more it will get. IMO, they should provide:

  • Easy blocking, URL removal, and expedited removal. Check!
  • Inclusion notification and statistics.
  • A webmaster suite allowing them to customize the cache and protect their content.
  • ...?

I did a little rummaging on the site in the name of research, but what I'd really be interested in is the monetization. Webaroo is ad-supported, and this is a fine line when you're displaying other people's content. I would download the software and report on the ad implementation, but there's no Linux support. Who's up for it?

Outside the box

Yes, I can see an upside; unfortunately it's for them, not for me.

Even if they keep your advertising on your pages what's the point?

I can see ways that offline readers could make information requests and fill out forms offline that get synched when reconnected, but they never mention anything like that. If you're interested in an ad on a page, how will you ever see it unless the page the ad refers to is also in the offline content? That means you have to devise a whole new method of advertising just to meet the needs of the ScrapeaRoo niche market.
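To be clear about what I mean by forms that get synched when reconnected, here is a rough sketch. Nothing in the article says Webaroo does this; `OfflineFormQueue` and the `send` callback are names I made up, and a real version would also need to persist the queue to disk.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OfflineFormQueue:
    """Holds form submissions made offline until a connection is available."""

    pending: list = field(default_factory=list)

    def submit(self, form_data: dict) -> None:
        # Offline: record a copy of the submission instead of sending it.
        self.pending.append(dict(form_data))

    def sync(self, send: Callable[[dict], bool]) -> int:
        """Try to send every queued form; keep the ones that fail.

        `send` is a caller-supplied function returning True on success.
        Returns the number of forms sent.
        """
        still_pending = []
        sent = 0
        for form in self.pending:
            if send(form):
                sent += 1
            else:
                still_pending.append(form)
        self.pending = still_pending
        return sent
```

The reader fills in the form on the train, `submit` stores it, and `sync` runs once the laptop is back online. Anything that fails to send stays queued for the next attempt.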


The guy is sitting on the train reading your page and by the time he's gotten to the office the page is read, closed, out of sight, out of mind.

Worse yet, do you even get a clue how big your list of readers is with this thing?

Finally, copyright is OPT-IN, not OPT-OUT. The problem I have with all these so-called services based on crawling MY SITE is that they are never OPT-IN, it's always OPT-OUT, so if you didn't know ScrapeaRoo was downloading your content in the first place, you'd be oblivious. Before someone tries to counter this statement by ranting about the search engines doing the same thing, I'll counter that claim: I've OPTED-IN 5 of them, and nobody else crawls my site, NOBODY.
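Roughly what opt-in means on the server side, as a sketch only: the crawler names and the "looks like a bot" hints below are placeholders, not my actual rules, and user agents can be spoofed, so a serious setup would also verify the crawler's IP address.

```python
# Hypothetical opt-in list: only these crawlers are served.
ALLOWED_CRAWLERS = ("googlebot", "examplesearchbot")

# Crude, assumed hints that a user agent is an automated crawler.
CRAWLER_HINTS = ("bot", "crawler", "spider")


def status_for(user_agent: str) -> int:
    """Return the HTTP status to answer with, based on the User-Agent alone."""
    ua = user_agent.lower()
    if any(name in ua for name in ALLOWED_CRAWLERS):
        return 200  # crawler that has been opted in
    if any(hint in ua for hint in CRAWLER_HINTS):
        return 403  # crawler that was never opted in
    return 200  # looks like an ordinary browser
```

Unlike robots.txt, this is enforced by the server rather than left to the crawler's good manners, which is the whole point of opt-in.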

Besides, this is so 1990s technology, as it harkens back to the day people were burning online ecommerce sites to a CD so that people with real slow dial-up connections could shop offline and then connect just long enough to place the completed order.

Been there, done that, slap a cell modem on the laptop and get real-time content.

If only Bill would stop

If only Bill would stop blocking those bots and start feeding them crap, like I have asked him to, we could move on from here.

In fact, Bill, save yourself a lot of bother and hassle and simply 301 any of the bots, or other traffic you don't want, to me and I'll take care of everything for you :)

I'm nice like that :P

I tried that Jason.

He passed on the offer.

