open source

The end result of free - a spate of 80% solutions

October 23, 2005 19:22:30.327

Inessential hits on one of the ugly truths about free software (which is separate from open source) - if a given niche in software goes free, you end up with 80% solutions:

There are some email clients I personally like - Mailsmith and mutt, in particular - but I’m not the first person or the last to say that there is no Ultimate Email Client for OS X. Justin is right in that nobody can afford to create it. Even if you made Pretty Much the Greatest Email Client Ever, it would be hard to compete against Mail.app and gmail and so on. Email clients are like air: people don’t want to pay for something so basic. (Okay, some rare people will.) What’s frustrating is the sense that, by the year 2005, we should have a great email client. It’s not like it’s new technology. It could be done. The problem is the economics.

books

Untold history

October 23, 2005 13:05:01.750

I've been reading "Red Star Rogue" - a history of a Soviet sub that apparently tried to launch a nuclear warhead at Pearl Harbor in 1968. It's an amazing book - I'll have to look for other information on this incident after I finish it. The context of the times is interesting as well - as this rogue strike was being attempted, you also had the Pueblo incident and the Tet Offensive in Vietnam. In other words, lots of bad things happening all at once. Kind of amazing to me to consider just how close things got to global war when I hadn't even reached the age of 7...


web

Get your typos here

October 23, 2005 11:23:36.814

The web is a marvelous thing - sometimes it even immortalizes the otherwise irrelevant typo. Witness "Instant Massaging"


news

Gentlemen, apply your SPF 30

October 23, 2005 11:01:36.421

Spotted in digg: NASA has a very sharp picture of the Sun available - check it out.


holiday

And now... Halloween!

October 22, 2005 18:36:44.593

Or a Halloween party, at least. My wife and daughter always go to town for this holiday - here's a small shot of what they put me up to:

Halloween 2005 - Devil


itNews

Who are the rebels?

October 22, 2005 11:13:26.754

I love this InfoWorld article for the sheer lack of self awareness involved. Writing from the "trenches" of IT, we get:

If there's a happy ending, this is it: The evil empire of bureaucracy never realizes that IT is awesomely adept at subverting authority. We attend teleconferences from our desk, where we multitask. We fill out the innumerable forms "they" insist they need by copying and pasting from other documents. We swiftly learn what they consider the right answer and give it back to them as quickly as we can, so we can get back to work.
In short, instead of confronting the enemy head on, we move into guerrilla-warfare mode. Long live the rebel forces!

Sorry to burst your bubble, but for too many of us, the "they" you speak of is IT, and the "rebel forces" are the rest of the organization desperately trying to get work done in the face of mindless IT "standards" that get rammed down our throats.


development

They don't know what they don't know

October 22, 2005 10:17:30.589

I just love this ComputerWorld story on "object based storage" - talking about the idea like it's a new one. Maybe the author of this piece should visit Gemstone or Objectivity. Sheesh:

Serving as a sort of boot camp for scattered data, object-based storage techniques thrive in organizations that need heavy doses of discipline both to appease hovering regulators and strengthen internal data retention and retrieval methods.
Here's how it works: Object-based archiving technology corrals disparate data files - documents, images, video clips or audio files - into content "objects" tagged with metadata to make the information searchable regardless of location. Also called content-aware or content-addressable storage, the technology is still in its infancy but is often hailed as a fast and easy way to pool and manage large data sets.

Back to the future!


itNews

Homebrew on the rise?

October 22, 2005 10:07:35.473

ComputerWorld has a story that runs counter to the conventional wisdom - internally developed software is becoming more common?

Packaged software is getting whacked ...

... by a shift inside IT to develop apps internally. That's the conclusion drawn by Ken Berryman, a consultant at McKinsey & Co. who also spoke at SoftSummit 2005. According to Berryman, New York-based McKinsey in 1998 estimated that 31% of business applications were internally developed. By 2003, that percentage had jumped to 42%, while packaged apps fell from 32% of the mix to 28%, he says. Berryman says he expects the trend to continue because there is now "a much more standard software stack" for IT, including everything from middleware to network protocols. Plus, he says, development tools are improving.

I wonder if this is a "real" jump, or a Sarbanes-Oxley induced jump?


development

Inertia explains a lot

October 22, 2005 9:33:53.784

Adam Connor brings up the most common reaction I see when a non-mainstream language is proposed:

They do care about price (including long-term maintenance), but there are a lot of other considerations. A brilliant Lisp programmer may produce a more effective, maintainable solution, but what if he leaves? Hiring Lisp programmers might be tricky, and thus entails risk. Moreover, most businesses would rather hire a strong business analyst with so-so programming skills than a brilliant programmer with so-so business skills. The reason is simple: the business analyst will produce a pedestrian solution to the right problem, whereas the brilliant programmer will produce an elegant solution to the wrong problem. Or so the thinking goes; of course, the ideal is to get someone who is strong at everything, but they are scarce and priced commensurately.

Seems that learning a new language is a nearly impassable hurdle, at least in the minds of a lot of the industry.


logs

Weekly Log Analysis, 10/21/05

October 21, 2005 22:02:38.402

Time for my weekly look at the logs - the BottomFeeder downloads dropped back from the stratospheric levels they reached last week, to a still respectable 403 per day:

Platform         BottomFeeder Downloads
HPUX             672
Windows          637
Mac 8/9          399
Sources          308
Mac X            216
Linux x86        182
Update           175
CE ARM           122
Windows98/ME     29
Linux Sparc      28
Solaris          20
AIX              16
Linux PPC        7
SGI              6
ADUX             3
Source Script    3

Those HP download numbers are a source of constant amazement to me. Off to the html page accesses, where it looks like IE is staging a comeback:

Tool                 Percentage of Accesses
Internet Explorer    45.8%
Mozilla              39.9%
Other                7.7%
MSN Bot              2.3%
Google Bot           2.3%
Java                 2%

I guess the uptick in readership is driving me more toward the average browser usage, which is still heavily IE. On to the RSS pages:

Tool                 Percentage of Accesses
Mozilla              27%
BottomFeeder         12.7%
Other                12.6%
Net News Wire        10.3%
BlogSearch           4.8%
Safari RSS           4.4%
Planet Smalltalk     3.5%
NewsGator            3.2%
Magpie               2.8%
RSSReader            2.5%
Internet Explorer    2.4%
SharpReader          2.3%
BlogLines            2.1%
Feed Reader          1.5%
Feed Demon           1.5%
RSS Bandit           1.2%
Liferea              1.2%
Google Bot           1%
Jakarta              1%
JetBrains            1%
News Fire            1%

Looks like the RSS aware portion of the audience is still very Mozilla and Mac centric - with the rest of my audience spread across a very diverse range of tools. On the other hand, if you look at the tools owned by NewsGator now, they have 15% of my audience.


itNews

Stupidity vaster than SCO's located

October 21, 2005 18:47:22.511

I wouldn't have thought it was possible, but I think I've spotted a well of stupidity that's actually deeper than the one Darl McBride lives in:

Charlotte, N.C.-based Scientigo owns two patents (No. 5,842,213 and No. 6,393,426) covering the transfer of "data in neutral forms." These patents, one of which was applied for in 1997, are infringed upon by the data-formatting standard XML, Scientigo executives assert.
Scientigo intends to "monetize" this intellectual property, Scientigo CEO Doyal Bryant said this week.

And I thought the boys from Eolas were ambitious.


deployment

Debugging the live server

October 21, 2005 15:31:38.180

Every so often, I make a mistake in the deployment of code from my test server to the production server (the Smalltalk image that runs this and the other blogs here). In those cases, I go back to the methods I updated and look to see what's different. Now, I'd rather not take the server down for this kind of thing, so instead I do something like this and load the patched method into the system:


[self codeThatMightBeWrongHere]
     on: MessageNotUnderstood
     do: [:ex | Transcript show: ex errorString; cr].

Then I do something with the server that will exercise the modified code, and watch the Transcript (scrolling to a file) to see what happens. Once I figure that out, I restore the original code, do more testing on the test server, and deploy the needed fix. All without taking the server down. It's one of the cooler aspects of having a full development system available as your deployed server.
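For anyone who wants to try the pattern outside a live server, here is a minimal workspace sketch. It is not the code I actually loaded: `nil foo` is just a stand-in for the suspect code (nil does not understand #foo), and the fallback value #failed is made up for illustration. It uses the standard exception protocol (`message`, `receiver`, `return:`) to log which selector failed on which receiver, then hands back a value so the caller can carry on.

```smalltalk
"Sketch only: 'nil foo' stands in for the code that might be wrong.
 The handler logs the failed selector and its receiver to the Transcript,
 then returns a placeholder value instead of unwinding with an error."
result := [nil foo]
	on: MessageNotUnderstood
	do: [:ex |
		Transcript
			show: 'MNU: ';
			show: ex message selector printString;
			show: ' sent to ';
			show: ex receiver printString;
			cr.
		ex return: #failed].
result
```

Evaluating this should leave #failed in result and one line in the Transcript naming the selector and receiver, which is usually enough to see where the deployed code and the test code diverge.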


news

ISP trouble, spam surge, what?

October 21, 2005 14:28:27.235

So is this report from Boing Boing some limited ISP problem, a large spam surge, a viral attack of some sort, what? I haven't seen any slowdown myself, so it's rather abstract to me at this point.


product management

Limited Releases aren't

October 21, 2005 14:11:02.427

Steve Rubel is down on Flock (something I haven't seen), but that's not really the point of this post. Here's the snippet from Steve that I want to look at:

Last night I tried out Flock, a new Mozilla-based browser that's getting a ton of buzz. The press is chiming in here too. Originally just a handful of people were invited to try Flock, which is in developer preview. Unfortunately, Flock installers quickly spread around town and the company released it out to everyone to try.

That last bit is important - Flock installers quickly spread around town. The error made by the Flock people is in the assumption that you can make a limited release available on the web. If you have something that well read people are going to look at - and comment on - then it will stop being a limited release within a few minutes. Which might be a problem, depending on whether your stuff was really ready for general release.

Bottom line - there's pretty much no such thing as a limited release anymore.


marketing

Tired of Web 2.0 yet?

October 21, 2005 12:28:18.767

Dare says that "Web 2.0" is meaningless hype, and quotes Joel Spolsky, who has a good post out on the same topic. Here's Dare's summary:

I feel the same way. I am interested in discussions on the Web as a platform and even folksonomies (not tagging) but the marketplace of ideas has been polluted by all this "Web 2.0" garbage. Once again, I've flipped the bozo bit on Web 2.0. Like Joel, you won't see any use of the term on my blog or in items I link to from now on.

I think they are both dead on target with this. There's a lot of buzzword bingo going on, and it looks like people are starting to spend money stupidly again.


management

Rewrites for the heck of it

October 21, 2005 11:59:10.367

Jonathan Schwartz makes the point that rewrites done just for the sake of a new language/toolkit don't make sense - something the entire industry collectively forgot about a decade ago:

Before I receive 2,000 email critiques, you should know my roots are in desktop software. So lest you think I'm coming at this from the perspective of a knuckle dragging big iron computer guy, that's not me.
As a software guy, here's a simple (though often irritating) rule behind user oriented software: The language in which a product is written has nothing to do with the value it conveys. Coming from the company that produced Java technology, that probably sounds a little odd. But it's a simple truth, especially when it comes to users: if the app's no good, it's no good, even if it's implemented in Java. Or PHP. Or Rails.
...
Because rewriting an app simply to use a new toolkit isn't creating value for consumers. Creating an application or service that delivers unique value is what captures users. And the internet gave some developers a tremendous opportunity to deliver unique value - by radically simplifying basic networking, enabling connectivity and community on a truly global scale.

Couldn't have said it better myself.


media

Death of print media?

October 21, 2005 11:48:38.730

PR Differently is pretty sure that his prediction about the demise of large parts of print media is coming along. I'm not about to argue the other side of that one :)


usability

Like weeds, bad ideas just spread

October 21, 2005 11:46:14.021

Via Scoble, I see that the awful idea that is the Office 12 Ribbon is already spreading. Great - I really wanted more wasted screen space everywhere. The Office 12 team that came up with the Ribbon really, really needs to be flogged.

Better yet, they can come to Columbia, and join the local highway department. Those guys are filled with bad usability ideas for the roads, so the Office 12 folks will fit right in.


media

Publishers and Authors - we like obscurity

October 20, 2005 18:11:11.110

Gosh forbid I should be able to execute a search and find a book that I might want to buy - no, publishers and authors seem to like the current system, where finding a reference requires a trip to the library (which most people won't make). Has it occurred to these fools that better searching will lead to more sales? Or do they take the same stupid pills that the RIAA and MPAA are on?


cst

How and why to override a method

October 20, 2005 11:30:39.425

One of the things I bring up here is why overriding base library behavior is often useful - I thought it might be a good idea to give an example of that, and show how you manage it in VisualWorks.

Here's a view of the Browser, with a method overridden:

Browser Showing Override

I've shrunk the image for space reasons, but in the top right you'll see that the method is red. That indicates that I've overridden a method owned by a different package (WithStyle, in this case) in the posting tool's package. The override is this snippet of code I added:


	(value isNil or: [value isEmpty])
		ifTrue: [value := ParagraphEditor currentSelection].

In the method above, "value" holds a string that will be the default choice in a dialog box that is popped up for data entry. In this case, it's popped up when the user wants to insert an image into the post. I figured it would be nice if the tool remembered the last image that got uploaded, and made that the default choice. As it happened, the WS method was usually leaving that choice blank (thus, requiring me to type it all in).

I spoke to Michael, and he told me that better pluggability there would be a good thing. In the meantime though, I'm stuck - unless I override. So I did that, and you see the way the browser displays it - in a way that is easy to pick out. I can also version this override separately from the WS code (thus, not perturbing their codebase). Later, after they've addressed this, I simply remove the override (a menu pick in the browser) and adjust my code to deal with their pluggability.

Simple, and it gets the job done - and in the interim - unlike the situation with Java, or C# - I don't have to either wait for the vendor, or write an entire replacement library/wrapper to get around the limitation. Two lines added to an override, and I'm done.

Of course, some people "don't trust developers", so they are happy wearing the chains the vendors hand them...


blog

Mass Comment Deletion

October 20, 2005 10:07:00.998

Spam hasn't been a huge problem on the CST blogs - for one thing, we have obscurity (my own server) on our side, and, for another, the various blocking schemes I've put in place seem to work pretty well. Even so, it's useful to be able to delete multiple comments at once - so that's a feature of the web admin stuff now. If you login, and go to the post editing page, you can select a post, see all the comments, and select which ones (if any) to delete.


movies

Is this a sign of the impending apocalypse?

October 20, 2005 7:50:01.095

Spotted in SCI FI Wire:

David Hasselhoff told Australia's Rove Live TV show that he's acquired the film rights to his old Knight Rider TV series and still plans to turn it into a feature film, according to the Moviehole.net Web site.

development

Iterating toward Smalltalk

October 20, 2005 7:48:11.081

I suppose the Lisp folks see the same thing, just a bit slower. The evolution of development languages is clearly in the direction of what Smalltalk is, with the great mass of developers being dragged kicking and screaming, begging not to have C syntax taken away from them. Consider the progress:

  • With Smalltalk gaining popularity in the late 80's, we had the abortion that is C++ introduced, which at least brought (some) OO concepts out to the masses
  • Java came along, which mainstreamed VMs and garbage collection. That was progress, but decent reflection was left out, and the C style was kept
  • Ruby and Python are starting to gain ground (especially the former now, with Ruby on Rails). The syntax is still Cish (at least to my eyes), but we've got closures and decent reflection. Not the power or environment that Smalltalk has, but closer

From the perspective of a Smalltalker, it looks like a grudging acceptance that we were right all along, but the foot dragging and wailing can still be heard. Kind of like a toddler being put to bed...


spam

One more thing on Splogs

October 19, 2005 17:02:22.234

Jeff Jarvis commented on the splogs this weekend (like everyone else), but there are a couple of interesting comments if you scroll down - have a look at what Steven DenBeste said:

Ultimately there isn’t any permanent solution to this kind of thing. Any system which permits anyone to create readable material on anyone else’s system will be abused by spammers.
The only real solution is to assume that a certain percentage of people out there are hostile, and to design accordingly. Automatic trackback, for instance, was always a terrible idea because it assumed universal good faith.

And then below that, this:

By the way, has it occurred to you that it is economically to Google’s advantage to let the current situation persist? As long as Google can keep its own search results clean, then the spam blogs will make everyone else’s search engines useless and thus drive traffic to Google.
Why would Google want to change the situation? Certainly it would be both illegal and immoral for Google to actively work to pollute the search results of its competitors, but I don’t think that benign neglect of spammer abuse of Blogspot is actionable, and it serves the same purpose. Certainly if that’s what Google is thinking, then it’s slimy. But not illegal, and not actionable. And it’s difficult to see why Google would want to expend any significant effort to try to fix the situation.

Certainly food for thought. Also, make sure to walk through to Elliot's post on splog prevalence on BlogSpot - apparently, it's nearly a third (I'd really love to see historical tracking on that!). Kind of blows a hole in Evan Williams' breezy 1% nonsense...


humor

How to deal with telemarketers

October 19, 2005 14:27:56.287

This seems a little harsh, but boy, was it funny. Via ArcterJournal.


management

What was I working on again?

October 19, 2005 11:33:39.564

Jon Udell posted on something I saw, but mostly skipped over a couple of days ago - this "LifeHacker" story in the New York Times. Truth be told, since the Times put the TimesSelect thing in place, I've paid a lot less attention to them - I assumed this story was behind that, and moved along (yet another way that the Times has marginalized themselves, but I digress).

Anyway - the heart of the story is this snippet here:

Lots of people complain that office multitasking drives them nuts. But Mark is a scientist of "human-computer interactions" who studies how high-tech devices affect our behavior, so she was able to do more than complain: she set out to measure precisely how nuts we've all become. Beginning in 2004, she persuaded two West Coast high-tech firms to let her study their cubicle dwellers as they surfed the chaos of modern office life. One of her grad students, Victor Gonzalez, sat looking over the shoulder of various employees all day long, for a total of more than 1,000 hours. He noted how many times the employees were interrupted and how long each employee was able to work on any individual task.
When Mark crunched the data, a picture of 21st-century office work emerged that was, she says, "far worse than I could ever have imagined." Each employee spent only 11 minutes on any given project before being interrupted and whisked off to do something else. What's more, each 11-minute project was itself fragmented into even shorter three-minute tasks, like answering e-mail messages, reading a Web page or working on a spreadsheet. And each time a worker was distracted from a task, it would take, on average, 25 minutes to return to that task. To perform an office job today, it seems, your attention must skip like a stone across water all day long, touching down only periodically.

That's not going to be a productive way to work, regardless of your profession. Jon is hoping for tool support to fix this, but I think that's fundamentally the wrong way to go - and online "status" pages that let other people know how busy you are aren't going to cut it either. People will just blow right past that, deciding that their stuff is more important. How many phone numbers do you run through when you get voice mail for someone on a given line? Do you just stop there? I didn't think so :)

Now, I work out of a home office, so the problem of people walking up to me goes away. That leaves the phone (land line and mobile), email, news aggregator, and IM. All of these are easy to ignore, if I want to. The only real way to solve this problem is personal fortitude. You want to focus on a task? Fine - close the door, mount a "do not disturb" sign, and get to work. Technology isn't going to save you here.


events

OOPSLA reporting

October 19, 2005 10:37:44.417

Travis has been posting from OOPSLA


web

The Network isn't a replacement

October 19, 2005 8:38:20.584

There's been a lot of hype recently about Sun and Google coming up with a web based replacement for Office. I've been quietly skeptical about this one for a while, but this article says everything I wanted to say about it.

The bottom line - the network isn't reliable enough for day to day applications. Heck, that's why I use POP access to gmail - I don't want to have to rely on network connectivity to look at mail. I take planes and trains often enough that being completely cut off from my mail for hours would be a real problem. I don't use web based aggregators for the same reason - I like having offline access (and Comcast makes sure that I have regular, short term offline experiences even while I'm in my office).

The last thing I want is a graphical VT-100 - and I don't need people explaining that reduced functionality and the danger of a complete loss of service is progress, either.


itNews

Trademark search?

October 19, 2005 8:28:34.392

You would think that an outfit as large as Google would do a trademark search before rolling out a service:

A trademark dispute has forced Google to re-brand its Gmail web mail service in the UK. Existing users get to retain their Gmail address (at least for now) but from Wednesday onwards new UK users will be given a Googlemail email address instead.
UK-based financial services firm Independent International Investment Research (IIIR) said its subsidiary ProNet Analytics has been using the Gmail name for a web-mail application since the middle of 2002, two years before Google began offering Gmail accounts to consumers. The email service offered by ProNet, by contrast, is used mainly by investors in currency derivatives.

I suspect that the problem here is a US-Centric one - I bet they looked in the US, saw no problems, and went ahead with a world-wide rollout. This is going to be a big problem for companies over the next little while.


media

Wikipedia and finding truth

October 19, 2005 8:13:19.807

Ryan brings up the Wikipedia quality issue that's been buzzing around lately, and runs smack into the real problem - after noting, via Dave Winer, that everyone has an equal voice on the Wiki, we get to this as the solution:

Identify people who have expertise or knowledge on certain subjects

That's harder than you might think - and it all depends on the subject. I find that Wikipedia is pretty good on historical subjects (at least older ones), and that's because any controversy that may have existed on the subject has passed. For instance - look up Julius Caesar - the history reveals that there have been a number of reversions lately, but the general information looks pretty good - the damage on that page is the garden variety "I'm excited by curse words" sort of damage.

Now have a look at something more recent, and more contentious - the 2000 US Presidential election. Go browse the blogosphere if you think that there's anything resembling consensus on how that went down. I can't see there being a fully objective view of something as controversial as that election for a long time - it wasn't until deep into the 20th century that the 1876 election was viewed with any objectivity, for instance.

So back to the expertise question - how does a "real" encyclopedia deal with this problem any better than Wikipedia does? Take any controversial topic for which varying interpretations exist (i.e., nearly any historical event that happened within the last 100 years) - where do you find experts who have "unassailable knowledge" of some event? The bottom line is, you don't. Let's take a subject I've read a fair bit about recently - WWI. It's long enough ago now that some level of objectivity is creeping in - but it's still colored by subsequent events (WWII, the Cold War) - enough to generate controversy. What's definitive?

And that's just five books I've read on the subject - five books with very different discussions of how (and why) the war was fought. Let's take the encyclopedia up now - how does the entry on WWI address the war? How does it explain the hows and whys? I'll tell you how - it uses the (then current) academic consensus. Is that "correct" in any abstract sense? Who knows? It might be - or it might not be. The reality is, even WWI is still too controversial for there to be a reliable "consensus" view. Which means that the entry - whether it's in printed copy or bits - is just going to be some compromise view.

Exactly how does that differ for Wikipedia and any other work? It doesn't. The reality is, having "anyone" be able to edit doesn't mean that "everyone" will. Most people don't care deeply about any particular subject - the ones with an interest (and, of course, the vandals) will be the ones who show up. With the printed encyclopedia, anyone whose views fall outside the current academic consensus will just get cut out immediately. With Wikipedia, they have a chance to get their take peer reviewed and commented on.

Which leads me to the opposite view from Winer, and Ryan, and most other people - I'll take the Wikipedia approach over the standard. It's far more likely to allow a larger set of views to fight it out.


marketing

Shopping behaviors

October 19, 2005 7:40:44.310

Joi Ito has an early look at an ACNielsen study on online shopping around the world - and it shows that shopping behavior differs from region to region:

The US is way behind Europe in the amount of online shopping (ranking 11 worldwide), perhaps because mall shopping is so much easier than shopping in a European city. This encourages Europeans to shop online.

I've been surprised by how hard it is to buy things in Europe after 5 pm, that's for sure. Everything closes early - it's not at all like the US, where I expect to be able to get milk (or anything else) at midnight or later. I guess this study shows that it has an impact on how people shop online.

Anyway, interesting stuff.


humor

Spamtalk gets the BBN treatment

October 19, 2005 7:34:28.583

Better Bad News has a hilarious take on splogs. They don't call it that - they're aiming wider :)


humor

You've probably seen this before, but - it's hilarious

October 18, 2005 23:03:52.899

I'm sure most of you have seen this Gene Spafford quote, but I just ran across it, and it cracked me up:

Secure web servers are the equivalent of heavy armored cars. The problem is, they are being used to transfer rolls of coins and checks written in crayon by people on park benches to merchants doing business in cardboard boxes from beneath highway bridges. Further, the roads are subject to random detours, anyone with a screwdriver can control the traffic lights, and there are no police.

There's other good stuff here.


itNews

Sun eyeing Postgres - what does this mean?

October 18, 2005 21:38:12.056

This ComputerWorld story is mostly about Sun's hopes in the software business - which can mostly be summed up by: "We give our software away; why the heck can't we make money that way?"

Well, to add to that exciting suite of revenue makers, Sun is eyeing PostgreSQL:

"We're not going to OEM Microsoft but we are looking at PostgreSQL right now," he said, adding that over time the database will become integrated into the operating system.

That's Loiacono, VP of their software business. So does PostgreSQL stay open source? Does it stay cross platform? Is Sun just going to bundle it, or buy it? This article seems unclear.


humor

Been there, done that

October 18, 2005 20:41:15.021


rss

Measuring the daily reading load

October 18, 2005 18:04:33.370

One of the things that an aggregator allows you to do is keep up with a lot more information flow. As I said earlier today, I subscribe to 315 different feeds (44 of those are search feeds). I figured it might be interesting to see how much new content there is in a day from the non-media, non-search (i.e., mostly bloggers) feeds that I track. So, I opened up a workspace in BottomFeeder and started hacking out a script:


rejects := #('*feedster*' '*blogpulse*' '*google*' '*yahoo*' '*amazon*' '*icerocket*' '*rocketnews*' '*pubsub*' '*blogniscient*' '*digg*' '*sans*' '*infoworld*' '*computerworld*' '*linux*' '*slashdot*' '*wired*' '*rss.com*' '*internetnews*' '*comics*' '*file://*' '*technorati*' '*techrepublic*' '*meetup*' '*memeorand*' '*espn*' '*cnn*' '*extreme*' '*wbal*').
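"Midnight at the start of today; anything published at or after this counts as today's"
today := Date today asTimestamp.
"Drop every feed whose URL matches one of the media/search patterns above"
basicFeeds := RSSFeedManager default getAllMyFeeds reject: [:each | 
	(rejects detect: [:each1 | each1 match: each url] ifNone: [nil]) notNil].
"Collect title -> post count for each remaining feed that has items today"
counts := OrderedCollection new.
basicFeeds do: [:eachFeed | | todays |
	todays := eachFeed items select: [:each | each pubDateString >= today].
	todays notEmpty
		ifTrue: [counts add: eachFeed displayTitle -> (todays size)]].
"Busiest feeds first"
sorted := counts asSortedCollection: [:a :b | a value >= b value].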


It's a pretty simple script - I grab all the feeds, filter out the ones that are either media or search related, and then see which ones have content today. Then I slam the results into a collection, sort by frequency, and do an inspect-it on the results. Unlike those *cough* advanced *cough* languages in the mainstream, Smalltalk lets me do this at runtime, in the running application. Kind of cool :) Anyway, I wrote a quick script to slap that stuff in an HTML table:

Feed | Posts
The Corner | 80
MARS Activity | 35
Daily Kos | 31
Public Store | 28
PCWorld.com - Latest News Stories | 26
Bob Congdon | 25
Sam Ruby's Comments | 22
The Doc Searls Weblog | 22
Taegan Goddard's Political Wire | 21
ongoing | 20
Eschaton | 20
Samizdata.net | 17
Lambda the Ultimate - Programming Languages Weblog | 17
VodkaPundit | 16
Cook Computing | 16
Instapundit.com | 16
Microsoft Watch from Mary Jo Foley | 15
RSS News by CodingTheWeb.com | 15
Radio Free Blogistan | 15
Philip Greenspun Weblog | 14
Dvorak | 14
Web Things, by Mark Baker | 14
Mark Bernstein | 13
National Review Online | 11
Exploration Through Example | 10
lesscode.org | 10
PragDave | 10
Squeak People | 10
MemoRanda | 10
TalkLeft: The Politics of Crime | 8
Little Green Footballs | 8
N=1: Population of One | 8
Sci Fi Wire | 8
Media Blog | 8
Sjoerd Visscher's weblog | 8
Glenn Vanderburg: Blog | 8
cst | 7
Power Line | 6
Scripting News | 6
java.net Weblogs | 5
Michelle Malkin | 5
Sam Ruby | 5
Science @ NASA | 5
CincomSmalltalkWiki | 4
Micro Persuasion | 4
Traffic | 3
cst comments | 3
The Ornery American | 3
The Indepundit | 3
Don Park's Daily Habit | 3
Cafe au Lait Java News and Resources | 3
Larkware News | 2
Travis Griggs - Blog | 2
evhead | 2
Dare Obasanjo aka Carnage4Life | 2
Hugh Hewitt | 2
Captain's Quarters | 2
Joho the Blog | 2
Corante Blog | 2
Scobleizer - Microsoft Geek Blogger | 2
Derek's Rantings and Musings | 2
Alice Hill's Real Tech News - Independent Tech | 2
Daypop Search - BottomFeeder | 2
Software (Management) Process Improvement | 1
Joi Ito's Web | 1
Mark Watson's opinions on Java, AI, semantic web, and politics | 1
d2r | 1
planet squeak | 1
Chris Pirillo | 1
The Fishbowl | 1
Rob Fahrni, at the core. | 1
The Blog Ride | 1
Windley's Enterprise Computing Weblog | 1
Better Living Through Software | 1
Steve Shu's Blog | 1
Industry Analyst Reporter - Applications and Software News | 1
Workbench | 1
Matthew Yglesias | 1
WCBS 880: Yankees on WCBS | 1
cut on the bias | 1
Austin Bay Blog | 1
The Belmont Club | 1
PVRblog | 1
ScrappleFace | 1
The Doctor is in | 1
ARs closed Activity | 1
Sam Gentile's Blog | 1
Panopticon Central | 1

Now, I didn't get all the non-blogs out, but that's good enough for now - it's down to 89 feeds that way. The MARS one warrants some explanation - it's the feed off our internal bug tracking system, and we are approaching full code freeze for the next release - so activity is high. Other than that, the real outliers (i.e., lots of posts in a day) are group political blogs. Some of the high numbers also come from some kind of server reset of the feed, not actual new content. That's still a problem that can fool an aggregator - especially when the feed in question doesn't have IDs for the items.
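To make the missing-ID problem concrete, here is a minimal Python sketch (not BottomFeeder's actual code) of how an aggregator might decide whether an item is new. The `item_key` and `new_items` helpers and the dictionary fields are assumptions made up for illustration: use the feed-supplied ID when there is one, otherwise fall back to a hash of link plus title.

```python
import hashlib

def item_key(item):
    """Return a stable identity for a feed item (hypothetical helper)."""
    if item.get("id"):
        return "id:" + item["id"]
    # No ID in the feed: fall back to a hash of link + title.
    basis = (item.get("link") or "") + "\n" + (item.get("title") or "")
    return "hash:" + hashlib.sha1(basis.encode("utf-8")).hexdigest()

def new_items(seen_keys, fetched_items):
    """Return items not seen before, and record their keys in seen_keys."""
    fresh = []
    for item in fetched_items:
        key = item_key(item)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(item)
    return fresh

seen = set()
first_fetch = [
    {"id": "post-1", "title": "Hello", "link": "http://example.com/1"},
    {"title": "No ID here", "link": "http://example.com/2"},
]
# Simulate a server "reset": same posts, fresh timestamps.
# The key deliberately ignores dates, so these should not count as new.
second_fetch = [dict(item, pubDate="2005-10-18") for item in first_fetch]

print(len(new_items(seen, first_fetch)))   # first fetch: everything is new
print(len(new_items(seen, second_fetch)))  # refetch after reset: nothing is new
```

The fallback is only as good as the fields it hashes - if a reset also rewrites links or titles, the item will still look new, which is exactly why feeds without IDs are harder to track.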

Anyway - looking at "real" results, it looks like a dozen new posts is a lot - most people are well under that. In fact, if I filter the list to those who posted 10 or fewer times so far today, I get down to 63 feeds. It turns out that the 7 (8 after this one goes up) posts today put me up near the top of that list. In fact, 23 of the feeds only have one new item so far today.

So - if you skim the high volume news/search feeds, the posts on single author blogs aren't that hard to keep up with. At least not if you use an aggregator :)
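For readers who don't speak Smalltalk, the filter-count-sort script from this post can be roughly sketched in Python. This is an approximation under stated assumptions: the feed tuples are made-up stand-in data, `is_rejected` and `daily_counts` are hypothetical names, and `fnmatch` stands in for the wildcard `match:` used in the workspace script (note that `fnmatch` is case-sensitive on most platforms).

```python
from fnmatch import fnmatch

# Hypothetical stand-in data: (title, url, number of items published today).
feeds = [
    ("Some Blog", "http://example.com/blog/rss", 3),
    ("Search: Smalltalk", "http://www.feedster.com/search?q=smalltalk", 40),
    ("Quiet Blog", "http://example.org/rss.xml", 0),
    ("Busy Blog", "http://example.net/index.rdf", 12),
]
rejects = ["*feedster*", "*google*", "*technorati*"]

def is_rejected(url, patterns):
    """True if the URL matches any wildcard pattern."""
    return any(fnmatch(url, pattern) for pattern in patterns)

def daily_counts(feeds, patterns):
    """Keep non-rejected feeds that posted today, busiest first."""
    counts = [(title, n) for title, url, n in feeds
              if not is_rejected(url, patterns) and n > 0]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)

for title, n in daily_counts(feeds, rejects):
    print(title, n)
```

The difference from the original is where it runs: the Smalltalk version executes inside the live aggregator against real feed objects, while this sketch needs the data handed to it.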


web

Attention?

October 18, 2005 16:48:46.511

I'd be a whole lot more interested in Gillmor's blathering about attention if the site he flogs actually described what it does. It lets loose with a lot of buzzwords about my rights, and then asks me to join something. Umm, yeah - I've gotten the same pitch in gosh knows how many spams and junk mails too, Steve.

Here's a tip - you want this to go anywhere? Have the page that allegedly describes how great this stuff is actually tell me what the heck it is. In the meantime, I'll just use links. They make sense, and I don't need a set of buzzword bingo cards from a website to figure them out.

Bonus clue - I don't need to hand some website with less than no information on it my email address, name, and url to link.


development

Be careful what you wish for

October 18, 2005 14:18:52.270

Ted Neward explains why he doesn't like wide open dynamic support in a language:

First, the technical: dynamic languages may choose to expose more meta-control over the language, but there's nothing inherent in the dynamic language that requires it, nor is there anything in a static language that prevents it. Languages/tools like Shigeru Chiba's OpenC++ or Javassist, or Michiaki Tatsubori's OpenJava clearly demonstrates that we can have a great deal of flexibility in how the language looks without losing the benefits of statically-typed environments. So to attribute this meta-linguistic capability exclusively to dynamic languages is a fallacy.
Secondly is the cultural issue: is the idea of granting meta-linguistic power (known as meta-object protocol, or MOP) to a language a good thing? Stu asserts that it is: "My concern is who controls the abstractions. Developer-oriented languages (like Scheme) give a lot of control (and responsibility) to developers. Vendor-oriented languages (like Java) leave that control more firmly in the hands of the vendor." So in whose hands are these abilities to change the language best placed?
*deep breath* I don't trust developers. There, I've said it.

Well, I'll take the contrary view (what a shocker!) - I don't trust the vendors. And I say that as the Product Manager for Cincom Smalltalk. When a vendor ships you a set of tools, you get the viewpoint of their developers as to how things ought to work. If that set of tools isn't malleable, you're just stuck. Hit a wall because the library isn't suitable for your needs? Too bad, you now have to argue with the vendor. Bearing in mind that you might not win.

Think that it'll be easy in the "obvious" cases? Heck, I'm the flipping Product Manager here, and I allegedly set direction - do you think the engineers buy everything I raise as a needed core library change (I've done a small number of them for BottomFeeder)? Heck no - how far do you think you'll get with Sun or MS?

The alternative is what you see in lots of Java projects - one more wrapper around the (insert your favorite example here, like String) class, because Sun decided to seal that one. It's just more pickaxe and shovel work to plow through, because it's simpler to not trust the developers. As opposed to those *cough* godlike *cough* library developers.


itNews

Why did Ward leave MS?

October 18, 2005 9:51:48.179

Interesting piece of news here - Ward Cunningham (father of the Wiki) has left MS to be an Eclipse evangelist:

Microsoft Corp. has lost one of its high-profile hires to an open-source consortium. Mike Milinkovich, executive director of the Eclipse Foundation, announced on Monday that Ward Cunningham is leaving Microsoft to join the staff of the open-source tool consortium. Cunningham's new title is Director of Committer Community Development. Cunningham, the father of the Wiki concept, joined Microsoft about two years ago. At Microsoft, he was not involved directly in social-networking-software development.

That last bit is interesting - what did Microsoft want Ward to do, if they weren't going to have him work in the social software world?

Update: Dave Buck noticed this yesterday. Have a look at the quote Dave pulls - that's a pretty low level of excitement for an evangelist, IMHO.


cst

Lots of Little Processes in VW

October 18, 2005 9:43:42.733

One of the things people ask about a lot is the VW process model. The simple answer is that the VM is single threaded, so all VW processes are managed at the Smalltalk level - i.e., they aren't OS level threads. You can create OS level threads, but only in the context of threading an external API call. A good example of this can be seen in the various database connects that we ship - you'll note that we ship threaded and non-threaded (i.e., blocking) versions.

So given that, what are Smalltalk level processes good for? Well, bear in mind that you (as the developer) have full control over their semantics. That means that an application deployed on Windows will run exactly like one deployed on Linux (or Mac, or Unix) - a VW process is a Smalltalk artifact, so it's not going to be unpredictable. Let me walk through a simple example, using the BottomFeeder update loop. I subscribe to 315 feeds at the moment, so when the update loop fires, I get 315 VW level processes doing HTTP queries. If those were all OS level threads, the system would fall to its knees in seconds - I'd have to use a thread pool. Incidentally, I implemented one as an option for Bf - but I digress. Here's the main update loop (somewhat simplified for space reasons):


feedsToUpdate do: 
			[:aFeed | 
			| updater delay |
			updater := 
					[self 
						updateFeed: aFeed
						shouldForce: shouldForce
						totalFeeds: numberOfFeeds].
			self settings runThreadedUpdates 
				ifTrue: [self runThreadedUpdateFor: aFeed updateBlock: updater]
				ifFalse: [updater value].

			"other code here..." ].

If I have threading turned off (useful on slow connections, where I don't want the queries competing for bandwidth), I just iterate over the list. The interesting piece is in the threaded updates:


runThreadedUpdateFor: aFeed updateBlock: updater 
	self settings shouldThrottleThreads
		ifTrue: [self runWithThrottling: updater for: aFeed]
		ifFalse: [self runWithoutThrottling: updater for: aFeed]

That checks another setting, which controls whether the app should use a thread pool or not. The "throttling" code implements a pool, the non-throttled code just keeps forking off threads. That's how I run Bf, and it works fine (with a fast connection). Drilling to the throttled code:


runWithThrottling: updater for: aFeed
	self updateCounter addProcess: updater atPriority: self settings getUpdateLoopPriority.


addProcess: aBlock atPriority: aPriorityOrNil
	"add the process to the wait pool"

	self sem critical: [self waitingCollection add: aBlock->aPriorityOrNil]


That code simply adds the new process to a queue, which runs a limited number of processes at once. The non-throttled code?


runWithoutThrottling: updater for: aFeed 
	| proc |
	proc := updater newProcess.
	self updateCounter addThread: proc url: aFeed url.
	proc priority: self settings getUpdateLoopPriority.
	proc resume

Now that demonstrates something useful about the level of control you have over a VW process. I'm setting the priority of the process (by default, it's in a range from 1-100, with 8 "named" levels). Then I'm resuming the process. A VW process is defined simply as a block (the snippet all the way at the top) which later gets forked off. In this example, I'm setting the priority and then resuming (forking) the process. I'm also holding a reference to the process, so that it can be killed (for instance, if you take BottomFeeder offline, the system goes ahead and whacks all the in progress threads in that loop, along with the update loop itself).

The priority levels I mentioned are used by the default process scheduler - which is written in Smalltalk. What does that mean? It means that you have full control over the way processes run in Smalltalk. The default model runs the highest priority process that is ready to run, but - at a given priority level - no process will preempt another of the same priority. In other words, it's not time-slicing. Say you wanted it to be? Well, that's simple - to timeslice a given set of processes, you simply have a higher level process manage them (which is what my throttle does to some extent). If you want to timeslice the entire system? Have a look at class ProcessorScheduler and change the way it manages things.
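The scheduling rule described above (highest priority first, no preemption among equals) can be modeled with a toy Python scheduler. This is a sketch, not the VisualWorks ProcessorScheduler: `ToyScheduler`, `fork`, and `worker` are hypothetical names, generators stand in for processes, and preemption by a higher-priority process arriving mid-run is not modeled at all.

```python
import heapq
from itertools import count

class ToyScheduler:
    """Toy cooperative scheduler: highest priority first, no time-slicing."""

    def __init__(self):
        self._ready = []       # heap of (-priority, arrival order, name, generator)
        self._order = count()  # tie-breaker: FIFO within one priority level
        self.log = []          # names, recorded each time a process yields

    def fork(self, name, priority, gen):
        heapq.heappush(self._ready, (-priority, next(self._order), name, gen))

    def run(self):
        while self._ready:
            neg_priority, _, name, gen = heapq.heappop(self._ready)
            try:
                next(gen)  # run until the process gives up control itself
            except StopIteration:
                continue   # process finished; drop it
            self.log.append(name)
            # A process that yields goes to the back of its own priority level.
            heapq.heappush(self._ready, (neg_priority, next(self._order), name, gen))

def worker(steps):
    for _ in range(steps):
        yield  # voluntary handoff; nothing takes control away between yields

sched = ToyScheduler()
sched.fork("low", 30, worker(2))
sched.fork("high-a", 50, worker(2))
sched.fork("high-b", 50, worker(1))
sched.run()
print(sched.log)
```

The two priority-50 workers alternate only because they yield; the priority-30 worker gets no time until both are done. A time-slicing variant would need something at a higher priority forcing those handoffs, which is the role a managing process plays in the description above.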

It's a nice system, and it gives you a very high level of control over how your system runs.
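Going back to the throttled path: the post only shows the enqueue side (`addProcess:atPriority:` adding to a waiting collection under a semaphore). A rough Python sketch of the general idea, a queue drained by a limited number of workers, might look like the following. The `run_throttled` function is a hypothetical name, the worker side is an assumption rather than a description of BottomFeeder's code, and Python threads here are real OS threads, unlike VW processes.

```python
import queue
import threading

def run_throttled(jobs, limit):
    """Run callables with at most `limit` of them in flight at once."""
    waiting = queue.Queue()      # plays the role of the waiting collection
    for job in jobs:
        waiting.put(job)
    results = []
    lock = threading.Lock()      # guards the shared results list

    def worker():
        while True:
            try:
                job = waiting.get_nowait()
            except queue.Empty:
                return           # nothing left to do; this worker exits
            value = job()
            with lock:
                results.append(value)

    workers = [threading.Thread(target=worker) for _ in range(limit)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results

# Stand-in "feed updates": each job just reports which URL it handled.
urls = ["http://example.com/feed%d.xml" % i for i in range(10)]
jobs = [lambda u=u: "fetched " + u for u in urls]
results = run_throttled(jobs, limit=3)
print(len(results), "feeds updated")
```

Results arrive in completion order, not submission order, so anything that cares about ordering has to sort or key the output afterwards.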


spam

Looking at the damage again

October 18, 2005 8:51:41.276

Now that it's been a couple of days, I thought I'd have a look at my search results again, and see how recent (more valid) results have replaced the earlier splog ridden stuff. I posted some code and a table showing the damage a few days ago - let's see what's happened since:

Feed Title | Total Items | BlogSpot Items | Splog Percentage
IceRocket: "VA Smalltalk" | 80 | 10 | 13
IceRocket: "Squeak Smalltalk" | 80 | 21 | 26
BlogPulse: "Squeak Smalltalk" | 29 | 15 | 52
IceRocket: BottomFeeder | 80 | 62 | 78
BlogPulse: Cincom | 80 | 34 | 43
BlogPulse: BottomFeeder | 80 | 14 | 18
IceRocket: "Dolphin Smalltalk" | 80 | 15 | 19
Feedster Smalltalk | 80 | 27 | 34
Google Blog Search: BottomFeeder | 80 | 34 | 43
Feedster: VisualWorks | 80 | 10 | 13
BlogPulse: Smalltalk | 80 | 27 | 34
Feedster: Cincom | 80 | 28 | 35
BlogPulse: Dolphin Smalltalk | 57 | 19 | 33
BlogPulse: "Cincom Smalltalk" | 57 | 17 | 30
IceRocket: Cincom | 80 | 40 | 50
Technorati: BottomFeeder | 80 | 30 | 38
PubSub: Smalltalk | 80 | 74 | 93
BlogPulse: VisualWorks | 80 | 24 | 30
Technorati: "James Robertson" | 80 | 46 | 58
Feedster on: "James Robertson" | 80 | 29 | 36
Technorati: Cincom | 80 | 45 | 56

If you compare those numbers to the earlier ones, you'll see that they are trending down - the various engines have started responding to the problem. The results coming out of Feedster in particular are better - they seem to have done a pretty good job of weeding stuff out. Some of this, of course, is Google weeding out the splogs too. Also, those numbers are high due to the way BottomFeeder caches - all the old bad results are still sitting there, marked as read (i.e., ignored by me, but still in my data set). Let's modify the original search so that I'm only looking at results that have arrived on Monday or today. I've also relaxed the number needed to show "badness" from 10 down to 2, given the smaller data set:


| folder mgr feeds dict cutoff |
folder := RSS.RSSFeedViewer allInstances first feedTree selection.
mgr := RSS.RSSFeedManager default.
feeds := mgr getAllFeedsFrom: folder.
dict := Dictionary new.
cutoff := Timestamp readFrom: '10/17/05' readStream.
feeds do: [:eachFeed | | matches all |
	all := eachFeed items select: [:eachItem | eachItem pubDateString >= cutoff].
	matches := all select: [:eachItem | 
					eachItem link 
						ifNil: [false]
						ifNotNil: [('*blogspot*' match: eachItem link)]].
	matches size >= 2
		ifTrue: [dict at: eachFeed displayTitle trimBlanks put: (all size -> matches size)]].


That simply adds a cutoff date of Monday at midnight. So what's been hammered since then?

Feed Title | Total Items | BlogSpot Items | Splog Percentage
IceRocket: Smalltalk | 80 | 5 | 6
PubSub: Smalltalk | 29 | 24 | 83
IceRocket: Cincom | 37 | 6 | 16
Feedster Smalltalk | 38 | 6 | 16

That's a much smaller amount of damage - although, it looks like PubSub's matching algorithm is particularly vulnerable to this sort of attack.
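As a cross-check on how those percentages come out, here is a small Python sketch of the same cutoff-and-count logic. It is an illustration under assumptions: `splog_stats` is a hypothetical name, the item dictionaries are made-up data arranged to mirror the PubSub row (29 recent items, 24 from blogspot), and the integer formula rounds halves up, which is what the Smalltalk `rounded` in the earlier table script appears to do.

```python
from datetime import datetime
from fnmatch import fnmatch

CUTOFF = datetime(2005, 10, 17)  # Monday at midnight, as in the post

def splog_stats(items, cutoff=CUTOFF, threshold=2):
    """Return (total, blogspot, percent) for recent items, or None if below threshold."""
    recent = [i for i in items if i["published"] >= cutoff]
    matches = [i for i in recent
               if i.get("link") and fnmatch(i["link"], "*blogspot*")]
    if len(matches) < threshold:
        return None
    total, spam = len(recent), len(matches)
    percent = (200 * spam + total) // (2 * total)  # round half up, integers only
    return total, spam, percent

# Made-up items: 24 recent blogspot links, 4 recent ordinary links,
# 1 recent item with no link, and 1 old blogspot item before the cutoff.
items = (
    [{"link": "http://spam%d.blogspot.com/" % n, "published": datetime(2005, 10, 18, 8)}
     for n in range(24)]
    + [{"link": "http://example.com/post%d" % n, "published": datetime(2005, 10, 17, 12)}
       for n in range(4)]
    + [{"link": None, "published": datetime(2005, 10, 18, 9)}]
    + [{"link": "http://old.blogspot.com/", "published": datetime(2005, 10, 15)}]
)
print(splog_stats(items))
```

Python's built-in `round` rounds halves to the nearest even number, so a value like 12.5 would come out as 12 rather than the 13 shown in the tables; that is why the sketch avoids it.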


itNews

Rule Driven Changes in IT management

October 18, 2005 7:54:17.257

InfoWorld reports that the rise of the CIO is over - the job is being done in by the reporting requirements of Sarbanes-Oxley:

Because IT has a close relationship with all of a company's data stores, it provides everything the CFO wants to look at, including statutory reporting and analytical capabilities. At Merial, all senior IS directors now report to the CFO.
"Have information on a timely basis, with an audit trail, is what [Sarbanes-Oxley] required us to do. Everything must be traceable to the source," Lerner tells me. While IT is responsible for satisfying the needs for compliance, the CFO is the gatekeeper. So, in January, no more CIO.

I'm not sure what that means in the bigger picture - but it might be a good thing. I've seen an awful lot of projects that went on and on (long after they were obvious failures) solely because IT management couldn't stand up and deal with admitting it. With Sarb-Ox requirements and the CFO on the line, maybe there will be fewer. Or, human nature being what it is, maybe not :)


blog

Blogs and usability

October 18, 2005 7:41:31.648

Jakob Nielsen has a list of dos and don'ts for blog authors. Some of them matter more than others, and some depend a lot on the context of your blogging. For instance - the "own your own domain name" one.

That depends on what you are trying to accomplish. Me? I'm evangelizing Cincom Smalltalk (and ranting about things in the industry that cross my view). Given the evangelism aspect, it makes sense for me to be blogging on a Cincom server - my goal is to build the Smalltalk community. How important the domain is depends a lot on your goals.

Another one of his suggestions popped at me as problematic as well:

Many weblog authors seem to think it's cool to write link anchors like: "some people think" or "there's more here and here." Remember one of the basics of the Web: Life is too short to click on an unknown. Tell people where they're going and what they'll find at the other end of the link.

You have to "go with the flow" of blogging on this one. An awful lot (most) of what is written on blogs is extremely temporal - it's very much based in the now. Which means that for the person reading about the latest kerfuffle (technical, political, whatever) - they get the context. It's very unlikely that anyone will care that deeply in a month, much less a few years.

Much of the rest of what he wrote is good stuff though - have a look, and see what you think. In this area, it's definitely a YMMV thing.


web

Gah!

October 17, 2005 19:11:26.367

Tim Bray needs to set up a few search feeds:

Dave’s numbers suggest that there’s less there than meets the eye; that the numbers and reach of splogs are limited. It’s just that their automated content generation managed to cause them to fill up the ego feeds of a bunch of loudmouthed widely-read bloggers, who all screamed simultaneously.

The example I posted noted that searches for Smalltalk (as in, the programming language) got flooded. I would have to assume that Java searches were flooded the same way (or worse, given the larger number of potential readers). No, we aren't all interested only in ego searches. Some of us just want to see what's being said on topics of interest.


web

Nice idea, but...

October 17, 2005 15:50:02.114

Tim Bray wants to move to an "internet stamp" system in order to eliminate spam. It's a nice idea, but it will never work. Why? Well, what do you do if a bunch of domains decide to offer internet stamps for free? Just knock them off the net? Yeah, that'll go over well. There's another problem too - it requires a long grace period followed by a cutoff date - after which older clients will just stop working. Yes, I can sure see that happening too. How do you plan to manage an enforced upgrade across every platform on the net?

There's an even simpler problem. Let's say the cost is a penny a transmission, as Tim posits. This assumes a robust micropayment architecture (which doesn't exist). It also assumes that at that cost, spamming is prohibitive.

Hmm - that's $10,000 to send a million messages. Based on the kinds of revenues that spammers are supposedly rolling in, I suspect that this will be less of a disincentive than Tim thinks. People pay astounding amounts of money to put 30 second spots on the Super Bowl - spending $100,000 to put 10 million spam messages out just doesn't sound that wild to me. Not to mention the enormous pressure on governments to make unsolicited mail legal once there's tax revenue to be gleaned from it. No doubt you've seen the huge efforts governments take to stop unsolicited snail mail?
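The arithmetic is easy to check. A tiny Python sketch, where the penny-per-message price is the figure from Tim's proposal and `campaign_cost` is just a hypothetical helper name:

```python
from decimal import Decimal

COST_PER_MESSAGE = Decimal("0.01")  # the penny-per-transmission figure Tim posits

def campaign_cost(messages):
    """Total stamp cost in dollars for a given number of messages."""
    return COST_PER_MESSAGE * messages

for n in (1_000_000, 10_000_000):
    print(f"{n:>12,} messages -> ${campaign_cost(n):,.2f}")
```

Whether that is a deterrent depends entirely on what a spammer earns per message, and that number isn't in the proposal.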

Thanks, but no thanks. Take that solution and just bury it.


general

Ready to sleep now?

October 17, 2005 13:06:43.365

Real Tech News has disturbing info on what's in your pillow:

Fungal contamination of bedding was first studied in 1936, but there have been no reports in the last seventy years. For this new study, which was published online today in the scientific journal Allergy, the team studied samples from ten pillows with between 1.5 and 20 years of regular use. Each pillow was found to contain a substantial fungal load, with four to 16 different species being identified per sample and even higher numbers found in synthetic pillows.

Sounds lovely :)


BottomFeeder

Finding the Damage

October 17, 2005 9:28:21.296

One of the cool things about BottomFeeder is that I don't have to resort to eyeballing in order to figure things out - I have the full power of Smalltalk in front of me. So, I thought I'd have an objective look at the spam damage from splogs over the weekend. Here's what I did. First, I selected the folder that holds all my search feeds. Then I executed this:


| folder mgr feeds dict |
folder := RSS.RSSFeedViewer allInstances first feedTree selection.
mgr := RSS.RSSFeedManager default.
feeds := mgr getAllFeedsFrom: folder.
dict := Dictionary new.
feeds do: [:eachFeed | | matches |
	matches := eachFeed items select: [:eachItem | 
					eachItem link 
						ifNil: [false]
						ifNotNil: ['*blogspot*' match: eachItem link]].
	dict at: eachFeed title put: (eachFeed items size -> matches size)].
^dict

That resulted in an inspector that looks like this:

[Screenshot: Splog Spam Damage]

That's a useful view for scrolling through - let's cut things down and create a table that can be easily posted. I'll limit the table to feeds that have at least 10 bad results in them. First, I added a test to the previous script, such that only feeds passing my test get into that dictionary. I have 44 search feeds; 22 of them passed the bad results test. On to the HTML script:


stream := WriteStream on: (String new: 1000).
stream nextPutAll: '<table border="1" cellpadding="3">'; cr.
stream nextPutAll: '<tr>'; cr.
stream nextPutAll: '<td><strong>Feed Title</strong></td>'.
stream nextPutAll: '<td><strong>Total Items</strong></td>'.
stream nextPutAll: '<td><strong>BlogSpot Items</strong></td>'.
stream nextPutAll: '<td><strong>Splog Percentage</strong></td>'.
stream nextPutAll: '</tr>'; cr.
dict keysAndValuesDo: [:key :value | | total spam percent |
	stream nextPutAll: '<tr><td>'.
	stream nextPutAll: key, '</td>'.
	total := value key.
	spam := value value.
	stream nextPutAll: '<td>', total printString, '</td>'.
	stream nextPutAll: '<td>', spam printString, '</td>'.
	percent := ((spam/total) asFloat * 100) rounded.
	stream nextPutAll: '<td>', percent printString, '</td>'.
	stream nextPutAll: '</tr>'; cr].
stream nextPutAll: '</table>'; cr.
^stream contents

Running that produces the following output:

Feed Title | Total Items | BlogSpot Items | Splog Percentage
IceRocket: "VA Smalltalk" | 80 | 10 | 13
IceRocket: "Squeak Smalltalk" | 80 | 23 | 29
BlogPulse: "Squeak Smalltalk" | 29 | 15 | 52
IceRocket: BottomFeeder | 80 | 62 | 78
BlogPulse: Cincom | 80 | 34 | 43
BlogPulse: BottomFeeder | 80 | 13 | 16
IceRocket: "Dolphin Smalltalk" | 80 | 16 | 20
Feedster Smalltalk | 80 | 49 | 61
Google Blog Search: BottomFeeder | 80 | 35 | 44
Feedster: VisualWorks | 80 | 10 | 13
BlogPulse: Smalltalk | 80 | 31 | 39
IceRocket: Smalltalk | 80 | 38 | 48
Feedster: Cincom | 80 | 28 | 35
BlogPulse: Dolphin Smalltalk | 57 | 19 | 33
BlogPulse: "Cincom Smalltalk" | 57 | 17 | 30
IceRocket: Cincom | 80 | 55 | 69
Technorati: BottomFeeder | 80 | 31 | 39
PubSub: Smalltalk | 80 | 80 | 100
BlogPulse: VisualWorks | 80 | 24 | 30
Technorati: "James Robertson" | 80 | 46 | 58
Feedster on: "James Robertson" | 80 | 27 | 34
Technorati: Cincom | 80 | 44 | 55

Gives you an idea of the kind of spam attack that was running over the weekend, doesn't it?
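For comparison, the table-building step can be sketched in Python as well. This is an illustration, not a port of the workspace script: `render_table` and `percent_half_up` are hypothetical names, the two sample rows are taken from the table above, and the HTML escaping of titles is an addition that the original script does not do.

```python
from html import escape

HEADERS = ["Feed Title", "Total Items", "BlogSpot Items", "Splog Percentage"]

def percent_half_up(part, total):
    """Integer percentage, rounding halves up (12.5 becomes 13)."""
    return (200 * part + total) // (2 * total)

def render_table(stats):
    """stats maps feed title -> (total items, blogspot items)."""
    rows = ['<table border="1" cellpadding="3">']
    rows.append("<tr>" + "".join(f"<td><strong>{h}</strong></td>" for h in HEADERS) + "</tr>")
    for title, (total, spam) in stats.items():
        cells = [escape(title), str(total), str(spam), str(percent_half_up(spam, total))]
        rows.append("<tr>" + "".join(f"<td>{c}</td>" for c in cells) + "</tr>")
    rows.append("</table>")
    return "\n".join(rows)

stats = {
    'IceRocket: "VA Smalltalk"': (80, 10),
    "PubSub: Smalltalk": (80, 80),
}
html = render_table(stats)
print(html)
```

Escaping matters here because several feed titles contain double quotes, and any title containing `<` or `&` would otherwise break the markup.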


spam

Good news from Feedster

October 17, 2005 8:31:32.383

Looks like the Feedster guys lost some sleep this weekend - they've been having a look at the splog problem, and think they have an answer. The volume washing through Feedster results seems to be down, but it's hard for me to tell whether that's because:

  • I've already received all the crap there is to get on the keywords I search for
  • They've addressed the problem
  • The attack has been stopped/slowed/paused

I'm still getting bogus results from PubSub, IceRocket, and BlogPulse though. In any event, I think Google is where the action needs to take place. They're the ones who have the targeted system.


spam

Not getting it

October 17, 2005 7:52:58.754

Evan Williams is apparently so far removed from things that he doesn't see how bad splogs have gotten. The number of "good" blogs on BlogSpot versus the number of splogs there doesn't really matter. At all. What matters is that a coordinated attack using bots was able to render nearly every blog search system irrelevant over the weekend (Blogdigger seems to be an exception).

The splogs on BlogSpot are effectively all that's there now, and Google should have seen it coming - it's not like splogs just popped up this weekend, or like bot attacks are new.


media

Look the word up

October 17, 2005 7:38:45.538

"I don't think that word means what you think it means"

Ahh, fun, Nicholas Carr got tired of saying that IT is dead. So now he’s saying that Web 2.0 is “ammoral.”
Oh, really? Maybe you should check out the Web 2.0 stuff that Brian Bailey is doing. He is putting HDTV videos of his church’s services up. And much more. And he has a blog.

Scoble needs to understand the important difference between "amoral" and "immoral". Carr is asserting that the web is the former, not the latter. On balance, he's not wrong.


spam

Spam blogs go wild

October 16, 2005 14:32:57.410

Tim Bray noticed that spam blogs have just exploded - here's an example - look at the latest results for this Feedster search feed for Smalltalk. There have been 76 new items (some dupes) since midnight. Of those, 4 are actually what I'm looking for (references to Smalltalk, the programming language). Two are false positives, references to "smalltalk", as in speaking. The other 70? All splog results.

It's clear that these are bots at work - the Blogspot templates are the same for all the results. The funny thing is, the products and services being flogged aren't your typical pharmaceuticals and porn - the "offers" are all over the map. This seems to be a well organized and coordinated attack, using a boatload of fake blogs as the delivery vehicle.

Looks to me like it's time for Google to step up.


spam

Spam, the next frontier

October 16, 2005 11:57:39.473

It occurs to me that the next target for link spammers will be del.icio.us. They have a simple RESTful interface, and it's easy enough to set up an account. All I have to do is wait, I suppose.


spam

Blog Spam Central

October 16, 2005 11:11:23.844

Chris Pirillo says what the rest of us have been thinking for a while now:

In the past few days, I've been inundated with an enormous amount of subscribed search spam for designated keywords. To the tune of hundreds, if not THOUSANDS, of bunk entries. Who knew "lockergnome" and "pirillo" would be THAT popular?! Still, I can't help but think that others are having the same headaches - and 99% of the crap coming in is directly from a single domain: blogspot.com. Google, it may have been a smart acquisition in the beginning, but y'all need to clean house in a big way. You're the tallest nail, and you're really getting pounded - and now others, who aren't even using your service, are getting pounded. Blogspot has become nothing but a crapfarm, and your brand is going to go down with it. If your motto truly is to do no evil, then you need to start putting some resources behind an effort to curb this train wreck.
I don't know what's (specifically) making it so insanely easy for these spammers to get signed into your system, but you need to change that - ASAP. Forget about developing another Web-based aggregator for now (sorry, Shellen - Blogspot needs more help at this point). I'd love to ban / filter anything and everything that comes from blogspot.com, but the problem is that I have quite a few friends on that service who are sitting in the 1% "legitimate" minority.

As to why spammers go for that system, that's the simple part. It's free, and the signup process can be easily scripted. Which means that you can bot the whole thing, and create a universe of splogs within a few minutes. It's been a disaster waiting to happen, and now it's happening in a huge way.

Update: The sheer volume of these things is amazing. Every one of the blog search systems I use heavily for feeds - Technorati, BlogPulse, IceRocket, PubSub, Google Blog Search - they all get gamed by these splogs. Blog search just got a lot harder.


tv

Lost is screwing with our heads

October 16, 2005 1:12:38.975

We missed "Lost" on Wednesday - our misbehaving ReplayTV was on the fritz. So, we caught up this evening. In the middle of the episode, Hurley starts talking to Rose about his problem (the food found in the hatch). Jack walks by, and says hi to her.

At this point, the wife and I are saying - whoaaaa - she's dead! She died last season, I thought. I went hunting around the episode guides, and found this - "White Rabbit" last year, episode 5:

Jack is nearly delirious from lack of sleep and struggles to overcome the haunting events that brought him to Australia and, subsequently, to the island. Meanwhile, Boone gets caught in a treacherous riptide trying to save a woman who went out swimming. A pregnant Claire's health takes a bad turn from lack of fluids, and a thief may have stolen the last bottles of water. Veronica Hamel guest-stars as Jack's mother, Margo. Also, Jack (Matthew Fox) flashes back at 12 years old, to find himself on the playground in an altercation with a bully, who ultimately beats him up, and later learns a life lesson from his father.

I'm certain that the woman who Boone couldn't save was Rose - and now here she is, back in the flesh, and no one remembers that she's dead? What's up with that? Did I miss something, or is this a clue that will become plain later?
