Off to New Jersey
I'm heading up to New Jersey for a customer meeting in the morning - a quick ride up to Newark on Amtrak, and then the same thing back again in the late afternoon. I'll likely be network free until I get back.
Ted Neward echoes the conventional wisdom on CORBA - that lack of interop killed it:
For starters, Steve Vinoski was a bit miffed at the idea posited by Mark Baker that CORBA failed. Sorry, Steve, I have to say it, but I agree with Mark--CORBA never fulfilled on its intended promise of seamless middleware interoperability and integration capabilities, and certainly not over the Internet in any meaningful way. By the time CORBA began to address some of those issues--firewalls being a big one--the world had already pretty much abandoned both the "distributed object brokers" (the other being COM/DCOM) and were starting to explore HTTP as the be-all, end-all transport protocol.
Lack of interop was never really the problem - I've seen the VisualWorks CORBA broker working against a large variety of brokers, including a few that no longer exist. Ted touches on the right answer - firewalls. WS* succeeded where CORBA failed for a very, very simple reason - port 80. To get a CORBA hookup between two entities, you have to go have a discussion with the IT (and, if your outfit is big enough, IT security) guys and get them to open up a port in the firewall. Their default answer is going to be no, so this takes work. Easier to just forget the whole thing.
Now, take WS*, by contrast. Well - the SOAP posts come straight into the already open port 80, so you don't need to have that talk with IT. This makes it far, far easier for various skunkworks projects to get going before anyone notices them - by which point they might be too important to kill off.
The WS* stack is at least as complex as CORBA ever was (and, at this point, arguably more complex). It's no more or less interoperable between service brokers either (technology-wise). It's effectively more interoperable solely because it uses port 80.
Here's an article on tagging problems from September - I meant to comment on it then, but I find myself looking at my flagged posts now that I'm on a train. Oddly enough, that relates to the problem at hand. Here's the scenario that the post lays out as problematic:
Let's say Joe reads a new article about a battery technology breakthrough in the Scientific American. Joe has been thinking about buying a fuel-efficient car lately. When Joe goes to tag the article's web page, he uses the following tags: "battery," "fuel-savings," "car," "future-vehicle." Let's say the article comes with a .gif of a high-level schematic for how the battery works. Joe saves the .gif in his Flikkr account, tagging it with "battery," "schematic," and "fuel-savings."
Eighteen months and many tags later, due to Joe's profession as an engineer at Intel, he has an electric moment and realizes the battery tech breakthrough has more relevance to something he's directly working on, in nano-tech. Given the keywords he chose, will he be able to 1) recall how he tagged the original article, to find it later on or, 2) if he can find it at all, will he be able to easily re-tag the article and the schematic .gif to match the new context in which Joe finds these ideas relevant? I wouldn't bet on either outcome.
That is a problem, and it's one most of us run into a lot. I use del.icio.us to tag posts that I want to be able to find later - I use the tag "cst" for posts that I want to share with people about Cincom Smalltalk. Now, the problem I'm going to run into here isn't the same as the one above - I'm not going to forget the tag. However, over time, I'll tag a whole ton of things that way. Once I have tens (never mind hundreds or thousands) of posts tagged that way, how do I find the needle I actually want in that haystack?
The article suggests that refactoring tools (like Smalltalk's refactoring browser) for tag libraries are the answer. I don't think so. There's a wall of inertia that's going to prevent most people from doing that. Heck, the simpler problem that my title references is that most people won't tag their posts at all. Of the ones that do, a smaller subset will be motivated to refactor.
Don't believe me? Well, let's look at two A-Listers as an example - Scoble and Winer. The former never categorizes a post, and the latter rarely bothers with a title or a category. These two are widely read, and deeply involved in "web 2.0" discussions - and even they can't be bothered to take the minimal amount of action necessary to enable it. How likely do you think it is that the average web user will bother? For your answer, walk into anyone's old video cassette library and see how many of the ones recorded at home actually have a label. The answer will be enlightening.
Here's more evidence: I subscribe to 315 feeds as I write this, and I keep a fairly large cache of old items for each one. Let's trawl through those and see which ones have a category set:
RSSFeedManager default getAllItems size.
That tells me how many items I have sitting in memory. The response? 16,466. Now, let's see how many have no category set:
(RSSFeedManager default getAllItems select: [:each | each category isNil or: [each category = 'None']]) size.
The result there? 10,810. Nearly two thirds of the items I'm tracking have no category associated with them. Now, let's walk back to the web 2.0 discussions where the semantic web heads are trying to decide whether RDF, or OPML, or something else is the best way to make sense of all this. I'll make it simple for them - it just doesn't matter. The problem isn't the one posited in the article - i.e., "how did I categorize that item"? It's "holy smokes, I'm awash in a sea of completely uncategorized plain text!". Before someone chimes in that text search will auto-categorize, I'll point out that engines like Google already do a lot of that - and, as Scoble has been noticing, there are limits to that.
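The "nearly two thirds" figure is easy to check in a workspace. This is only a sketch: it plugs in the two counts quoted above as literals rather than querying RSSFeedManager again, and the variable names are mine.

```smalltalk
| total uncategorized percent |
"Counts quoted in the post, entered as literals"
total := 16466.
uncategorized := 10810.
"Keep the exact fraction, scale to a percentage, then round to a whole number"
percent := ((uncategorized / total) * 100) rounded.
percent
```

That evaluates to 66 - a hair under two thirds of the cached items have no category.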
Tim Bray wants to know what the point of splogs is:
I suspect most people never see spamblogs, but let me tell you, there are a lot of them out there and they get weirder and weirder and weirder. I’m actually baffled as to why they exist.
Oh, this is a simple one. Set up a search feed in your favorite aggregator, and then watch what comes back - especially from PubSub and Feedster, which are just filled with those right now.
End of the week, and time to have a look at the logs again - BottomFeeder downloads are back up - looks like last week was the blip. This week: 813 per day:
| Platform | BottomFeeder Downloads |
| Mac 8/9 | 1567 |
| Mac X | 1359 |
| Windows | 799 |
| HPUX | 777 |
| Update | 430 |
| Sources | 339 |
| Linux x86 | 190 |
| CE ARM | 109 |
| Solaris | 28 |
| Windows98/ME | 26 |
| Linux Sparc | 25 |
| AIX | 23 |
| Linux PPC | 8 |
| SGI | 7 |
| ADUX | 4 |
| CE x86 | 3 |
| Source Script | 3 |
Wow, the Mac numbers went way up - I wonder what that's about? Interesting, and the word I hear about the VW VM getting better on the Mac in 7.4 (December of this year) is welcome news, given those numbers. Next, a look at the HTML page accesses:
| Tool | Percentage of Accesses |
| Internet Explorer | 52.5% |
| Mozilla | 40% |
| Other | 3.4% |
| MSN Bot | 2.3% |
| Google Bot | 1.8% |
That's a fascinating jump up in IE hits - traffic has been up, and there was a huge spam wave last week - and most of the spam reports itself as IE. So, I don't think I can read anything into those numbers. Let's have a look at the RSS access:
| Tool | Percentage of Accesses |
| Mozilla | 23.1% |
| BottomFeeder | 15.4% |
| Net News Wire | 11% |
| Other | 8.9% |
| Feed Demon | 4.4% |
| Internet Explorer | 4.3% |
| Safari RSS | 4.3% |
| Planet Smalltalk | 3.7% |
| NewsGator | 3% |
| Magpie | 2.9% |
| RSSReader | 2.6% |
| SharpReader | 2.5% |
| BlogLines | 2.1% |
| Feed Reader | 1.6% |
| BlogSearch | 1.5% |
| Liferea | 1.4% |
| RSS Bandit | 1.2% |
| Jakarta | 1.1% |
| Google Bot | 1% |
| JetBrains | 1% |
| Feed Tagger | 1% |
| RSS 2 Email | 1% |
| News Fire | 1% |
The RSS feed accesses don't show the same IE spike, which tells me that it's almost certainly the spam surge. The variety of tools being used in this space is still huge though - the consolidation that started has not run its course yet.
Scoble thinks an acquisition can change MS' image (which basically means changing its culture). He has it backwards:
Oh, and all it would take to completely remake Microsoft’s image? One acquisition. I hear we have $60 billion in the bank. I don’t want all of it. Just a small percentage. In fact, it’ll cost far less than it cost us to settle with Real to get in this game.
MS is huge, which means that nearly any acquisition will be of a smaller entity. Smaller entities simply don't have the power to change a larger corporate culture. One of three things happens:
Unless the two entities are roughly the same size - in that case, you get what amounts to civil war for an unpredictable period of time, as each side tries to "win". I've seen that one personally, and watched it in customers. It's not pretty, and it doesn't help anyone.
Odds are, MS wouldn't have that, because a merger of near equals is hard to imagine for them. Any smaller entity they get will just be swallowed whole, with the corporate culture enforced over it. Chance of image/culture change for MS in all this? About nil.
Looks like AOL is trying to jump into the blog search game - Steve Rubel has the story:
AOL and Intelliseek on Monday plan to unveil a blog content deal. Sue MacDonald at Intelliseek confirmed that the deal - set to be announced Monday at 7:30 a.m. - will give AOL access to rich blog data that they will deliver to consumers. While MacDonald did not say what specific data AOL will get, one can certainly speculate that it will come from BlogPulse and reside on the new AOL.com site.
I've been fairly happy with the BlogPulse results - they don't seem to contain the volume of splog content that is making Feedster and PubSub less useful every day.
I continue to get requests along the lines of "is Smalltalk safe for the next 20 years?". I've got this post out there, which I think sums things up nicely. However, an article written by John Dvorak illustrated to me again just how hard it is to peer into the near future (much less any further out). I can't find the article in Dvorak's PC Mag archive yet - it'll probably appear there next week. When you look for it, the title is "Computers and Modern Anarchy". Dvorak got into a whole thread about control and anarchy that doesn't interest me a lot - but he did make a point along the way that I wanted to highlight:
If you were running a nexus point or a BBS, you had to have huge banks of modems and multiple phone lines to receive a user on your "site". Most users today can probably no longer configure or use a modem. Dial-up is automatic, and it dials the internet, not each individual target.
Imagine how you surf the web today and realize that before it existed, you had to get the phone number of the site and call it directly each time. There was no hyperlinking; if you wanted to jump from site A to site B, you would have to hang up on one site and dial another. This was standard practice a mere 13 years or so ago.
Think about that - I remember how excited I was about getting USENET access via one of the BBS systems back then - and I remember the large amounts of money a roommate spent on chat too (something that is completely free with AIM, MS Messenger, etc. today). In the early 90's, it was a completely different (online) world.
Now take that forward - what are things going to look like in 15 years? From 1990, I sure wouldn't have seen what's here now. I seriously doubt that anyone sees 2020 clearly either.
Civ III is much more difficult than the old "Civ" (the DOS game) was. I used to play that at the King or Emperor level; I finally managed to win a game at "Warlord" level this afternoon, and I only managed that by staying on Monarchy, relentlessly building military units, and whacking the other powers until they died. How the heck would you win a game via the space race route? I have no clue...
We missed "Lost" on Wednesday - our misbehaving ReplayTV was on the fritz. So, we caught up this evening. In the middle of the episode, Hurley starts talking to Rose about his problem (the food found in the hatch). Jack walks by, and says hi to her.
At this point, the wife and I are saying - whoaaaa - she's dead! She died last season, I thought. I went hunting around the episode guides, and found this - "White Rabbit" last year, episode 5:
Jack is nearly delirious from lack of sleep and struggles to overcome the haunting events that brought him to Australia and, subsequently, to the island. Meanwhile, Boone gets caught in a treacherous riptide trying to save a woman who went out swimming. A pregnant Claire's health takes a bad turn from lack of fluids, and a thief may have stolen the last bottles of water. Veronica Hamel guest-stars as Jack's mother, Margo. Also, Jack (Matthew Fox) flashes back at 12 years old, to find himself on the playground in an altercation with a bully, who ultimately beats him up, and later learns a life lesson from his father.
I'm certain that the woman who Boone couldn't save was Rose - and now here she is, back in the flesh, and no one remembers that she's dead? What's up with that? Did I miss something, or is this a clue that will become plain later?
Chris Pirillo says what the rest of us have been thinking for awhile now:
In the past few days, I've been inundated with an enormous amount of subscribed search spam for designated keywords. To the tune of hundreds, if not THOUSANDS, of bunk entries. Who knew "lockergnome" and "pirillo" would be THAT popular?! Still, I can't help but think that others are having the same headaches - and 99% of the crap coming in is directly from a single domain: blogspot.com. Google, it may have been a smart acquisition in the beginning, but y'all need to clean house in a big way. You're the tallest nail, and you're really getting pounded - and now others, who aren't even using your service, are getting pounded. Blogspot has become nothing but a crapfarm, and your brand is going to go down with it. If your motto truly is to do no evil, then you need to start putting some resources behind an effort to curb this train wreck.
I don't know what's (specifically) making it so insanely easy for these spammers to get signed into your system, but you need to change that - ASAP. Forget about developing another Web-based aggregator for now (sorry, Shellen - Blogspot needs more help at this point). I'd love to ban / filter anything and everything that comes from blogspot.com, but the problem is that I have quite a few friends on that service who are sitting in the 1% "legitimate" minority.
As to why spammers go for that system, that's the simple part. It's free, and the signup process can be easily scripted. Which means that you can bot the whole thing, and create a universe of splogs within a few minutes. It's been a disaster waiting to happen, and now it's happening in a huge way.
Update: The sheer volume of these things is amazing. Every one of the blog search systems I use heavily for feeds - Technorati, BlogPulse, IceRocket, PubSub, Google Blog Search - they all get gamed by these splogs. Blog search just got a lot harder.
It occurs to me that the next target for link spammers will be del.icio.us. They have a simple RESTful interface, and it's easy enough to set up an account. All I have to do is wait, I suppose.
Tim Bray noticed that spam blogs have just exploded - here's an example - look at the latest results for this Feedster search feed for Smalltalk. There have been 76 new items (some dupes) since midnight. Of those, 4 are actually what I'm looking for (references to Smalltalk, the programming language). Two are false positives, references to "smalltalk", as in speaking. The other 70? All splog results.
It's clear that these are bots at work - the Blogspot templates are the same for all the results. The funny thing is, the products and services being flogged aren't your typical pharmaceuticals and porn - the "offers" are all over the map. This seems to be a well organized and coordinated attack, using a boatload of fake blogs as the delivery vehicle.
Looks to me like it's time for Google to step up.
"I don't think that word means what you think it means"
Ahh, fun, Nicholas Carr got tired of saying that IT is dead. So now he’s saying that Web 2.0 is “ammoral.”
Oh, really? Maybe you should check out the Web 2.0 stuff that Brian Bailey is doing. He is putting HDTV videos of his church’s services up. And much more. And he has a blog.
Scoble needs to understand the important difference between "amoral" and "immoral". Carr is asserting that the web is the former, not the latter. On balance, he's not wrong.
Evan Williams is apparently so far removed from things that he doesn't see how bad splogs have gotten. The number of "good" blogs on BlogSpot versus the number of splogs there doesn't really matter. At all. What matters is that a coordinated attack using bots was able to render nearly every blog search system irrelevant over the weekend (Blogdigger seems to be an exception).
The splogs on BlogSpot are effectively all that's there now, and Google should have seen it coming - it's not like splogs just popped up this weekend, or like bot attacks are new.
Looks like the Feedster guys lost some sleep this weekend - they've been having a look at the splog problem, and think they have an answer. The volume washing through Feedster results seems to be down, but it's hard for me to tell whether that's because:
I'm still getting bogus results from PubSub, IceRocket, and BlogPulse though. In any event, I think Google is where the action needs to take place. They're the ones who have the targeted system.
One of the cool things about BottomFeeder is that I don't have to resort to eyeballing in order to figure things out - I have the full power of Smalltalk in front of me. So, I thought I'd have an objective look at the spam damage from splogs over the weekend. Here's what I did. First, I selected the folder that holds all my search feeds. Then I executed this:
| folder mgr dict |
folder := RSS.RSSFeedViewer allInstances first feedTree selection.
mgr := RSS.RSSFeedManager default.
feeds := mgr getAllFeedsFrom: folder.
dict := Dictionary new.
feeds do: [:eachFeed | | matches |
    matches := eachFeed items select: [:eachItem |
        eachItem link
            ifNil: [false]
            ifNotNil: ['*blogspot*' match: eachItem link]].
    dict at: eachFeed title put: (eachFeed items size -> matches size)].
^dict
That resulted in an inspector that looks like this:

That's a useful view for scrolling through - let's cut things down and create a table that can be easily posted. I'll limit the table to feeds that have at least 10 bad results in them. First, I added a test to the previous script, such that only feeds passing my test get into that dictionary. I have 44 search feeds; 22 of them passed the bad results test. On to the html script:
stream := WriteStream on: (String new: 1000).
stream nextPutAll: '<table border="1" cellpadding="3">'; cr.
stream nextPutAll: '<tr>'; cr.
stream nextPutAll: '<td><strong>Feed Title</strong></td>'.
stream nextPutAll: '<td><strong>Total Items</strong></td>'.
stream nextPutAll: '<td><strong>BlogSpot Items</strong></td>'.
stream nextPutAll: '<td><strong>Splog Percentage</strong></td>'.
stream nextPutAll: '</tr>'; cr.
dict keysAndValuesDo: [:key :value | | total spam percent |
    stream nextPutAll: '<tr><td>'.
    stream nextPutAll: key, '</td>'.
    total := value key.
    spam := value value.
    stream nextPutAll: '<td>', total printString, '</td>'.
    stream nextPutAll: '<td>', spam printString, '</td>'.
    percent := ((spam/total) asFloat * 100) rounded.
    stream nextPutAll: '<td>', percent printString, '</td>'.
    stream nextPutAll: '</tr>'; cr].
stream nextPutAll: '</table>'; cr.
^stream contents
Running that produces the following output:
| Feed Title | Total Items | BlogSpot Items | Splog Percentage |
| IceRocket: "VA Smalltalk" | 80 | 10 | 13 |
| IceRocket: "Squeak Smalltalk" | 80 | 23 | 29 |
| BlogPulse: "Squeak Smalltalk" | 29 | 15 | 52 |
| IceRocket: BottomFeeder | 80 | 62 | 78 |
| BlogPulse: Cincom | 80 | 34 | 43 |
| BlogPulse: BottomFeeder | 80 | 13 | 16 |
| IceRocket: "Dolphin Smalltalk" | 80 | 16 | 20 |
| Feedster Smalltalk | 80 | 49 | 61 |
| Google Blog Search: BottomFeeder | 80 | 35 | 44 |
| Feedster: VisualWorks | 80 | 10 | 13 |
| BlogPulse: Smalltalk | 80 | 31 | 39 |
| IceRocket: Smalltalk | 80 | 38 | 48 |
| Feedster: Cincom | 80 | 28 | 35 |
| BlogPulse: Dolphin Smalltalk | 57 | 19 | 33 |
| BlogPulse: "Cincom Smalltalk" | 57 | 17 | 30 |
| IceRocket: Cincom | 80 | 55 | 69 |
| Technorati: BottomFeeder | 80 | 31 | 39 |
| PubSub: Smalltalk | 80 | 80 | 100 |
| BlogPulse: VisualWorks | 80 | 24 | 30 |
| Technorati: "James Robertson" | 80 | 46 | 58 |
| Feedster on: "James Robertson" | 80 | 27 | 34 |
| Technorati: Cincom | 80 | 44 | 55 |
Gives you an idea of the kind of spam attack that was running over the weekend, doesn't it?
Real Tech News has disturbing info on what's in your pillow:
Fungal contamination of bedding was first studied in 1936, but there have been no reports in the last seventy years. For this new study, which was published online today in the scientific journal Allergy, the team studied samples from ten pillows with between 1.5 and 20 years of regular use. Each pillow was found to contain a substantial fungal load, with four to 16 different species being identified per sample and even higher numbers found in synthetic pillows.
Sounds lovely :)
Tim Bray wants to move to an "internet stamp" system in order to eliminate spam. It's a nice idea, but it will never work. Why? Well, what do you do if a bunch of domains decide to offer internet stamps for free? Just knock them off the net? Yeah, that'll go over well. There's another problem too - it requires a long grace period followed by a cutoff date - after which older clients will just stop working. Yes, I can sure see that happening too. How do you plan to manage an enforced upgrade across every platform on the net?
There's an even simpler problem. Let's say the cost is a penny a transmission, as Tim posits. This assumes a robust micropayment architecture (which doesn't exist). It also assumes that at that cost, spamming is prohibitive.
Hmm - that's $10,000 to send a million messages. Based on the kinds of revenues that spammers are supposedly rolling in, I suspect that this will be less of a disincentive than Tim thinks. People pay astounding amounts of money to put 30 second spots on the Super Bowl - spending $100,000 to put 10 million spam messages out just doesn't sound that wild to me. Not to mention the enormous pressure on governments to make unsolicited mail legal once there's tax revenue to be gleaned from it. No doubt you've seen the huge efforts governments take to stop unsolicited snail mail?
Thanks, but no thanks. Take that solution and just bury it.
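For anyone who wants to play with the stamp numbers, here's a tiny workspace sketch. The penny-per-message price is Tim's figure; the block and variable names are just mine for illustration.

```smalltalk
| costPerMessage costFor small large |
"A penny per message, kept as an exact fraction of a dollar"
costPerMessage := 1 / 100.
"Dollar cost of sending a given number of messages"
costFor := [:messageCount | messageCount * costPerMessage].
small := costFor value: 1000000.
large := costFor value: 10000000.
small -> large
```

That answers 10000->100000, i.e. $10,000 for a million messages and $100,000 for ten million. Change costPerMessage to see how high the price would have to go before the totals start to look like a real deterrent.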
Tim Bray needs to set up a few search feeds:
Dave’s numbers suggest that there’s less there than meets the eye; that the numbers and reach of splogs are limited. It’s just that their automated content generation managed to cause them to fill up the ego feeds of a bunch of loudmouthed widely-read bloggers, who all screamed simultaneously.
The example I posted noted that searches for Smalltalk (as in, the programming language) got flooded. I would have to assume that Java searches were flooded the same way (or worse, given the larger number of potential readers). No, we aren't all interested only in ego searches. Some of us just want to see what's being said on topics of interest.
Jakob Nielsen has a list of dos and don'ts for blog authors. Some of them matter more than others, and some depend a lot on the context of your blogging. For instance - the "own your own domain name" one.
That depends on what you are trying to accomplish. Me? I'm evangelizing Cincom Smalltalk (and ranting about things in the industry that cross my view). Given the evangelism aspect, it makes sense for me to be blogging on a Cincom server - my goal is to build the Smalltalk community. How important the domain is depends a lot on your goals.
Another one of his suggestions popped at me as problematic as well:
Many weblog authors seem to think it's cool to write link anchors like: "some people think" or "there's more here and here." Remember one of the basics of the Web: Life is too short to click on an unknown. Tell people where they're going and what they'll find at the other end of the link.
You have to "go with the flow" of blogging on this one. An awful lot (most) of what is written on blogs is extremely temporal - it's very much based in the now. Which means that for the person reading about the latest kerfuffle (technical, political, whatever) - they get the context. It's very unlikely that anyone will care that deeply in a month, much less a few years.
Much of the rest of what he wrote is good stuff though - have a look, and see what you think. In this area, it's definitely a YMMV thing.
InfoWorld reports that the rise of the CIO is over - the job is being done in by the reporting requirements of Sarbanes-Oxley:
Because IT has a close relationship with all of a company's data stores, it provides everything the CFO wants to look at, including statutory reporting and analytical capabilities. At Merial, all senior IS directors now report to the CFO.
"Have information on a timely basis, with an audit trail, is what [Sarbanes-Oxley] required us to do. Everything must be traceable to the source," Lerner tells me. While IT is responsible for satisfying the needs for compliance, the CFO is the gatekeeper. So, in January, no more CIO.
I'm not sure what that means in the bigger picture - but it might be a good thing. I've seen an awful lot of projects that went on and on (long after they were obvious failures) solely because IT management couldn't stand up and deal with admitting it. With Sarb-Ox requirements and the CFO on the line, maybe there will be fewer. Or, human nature being what it is, maybe not :)
Now that it's been a couple of days, I thought I'd have a look at my search results again, and see how recent (more valid) results have replaced the earlier splog ridden stuff. I posted some code and a table showing the damage a few days ago - let's see what's happened since:
| Feed Title | Total Items | BlogSpot Items | Splog Percentage |
| IceRocket: "VA Smalltalk" | 80 | 10 | 13 |
| IceRocket: "Squeak Smalltalk" | 80 | 21 | 26 |
| BlogPulse: "Squeak Smalltalk" | 29 | 15 | 52 |
| IceRocket: BottomFeeder | 80 | 62 | 78 |
| BlogPulse: Cincom | 80 | 34 | 43 |
| BlogPulse: BottomFeeder | 80 | 14 | 18 |
| IceRocket: "Dolphin Smalltalk" | 80 | 15 | 19 |
| Feedster Smalltalk | 80 | 27 | 34 |
| Google Blog Search: BottomFeeder | 80 | 34 | 43 |
| Feedster: VisualWorks | 80 | 10 | 13 |
| BlogPulse: Smalltalk | 80 | 27 | 34 |
| Feedster: Cincom | 80 | 28 | 35 |
| BlogPulse: Dolphin Smalltalk | 57 | 19 | 33 |
| BlogPulse: "Cincom Smalltalk" | 57 | 17 | 30 |
| IceRocket: Cincom | 80 | 40 | 50 |
| Technorati: BottomFeeder | 80 | 30 | 38 |
| PubSub: Smalltalk | 80 | 74 | 93 |
| BlogPulse: VisualWorks | 80 | 24 | 30 |
| Technorati: "James Robertson" | 80 | 46 | 58 |
| Feedster on: "James Robertson" | 80 | 29 | 36 |
| Technorati: Cincom | 80 | 45 | 56 |
If you compare those numbers to the earlier ones, you'll see that they are trending down - the various engines have started responding to the problem. The results coming out of Feedster in particular are better - they seem to have done a pretty good job of weeding stuff out. Some of this, of course, is Google weeding out the splogs too. Also, those numbers are high due to the way BottomFeeder caches - all the old bad results are still sitting there, marked as read (i.e., ignored by me, but still in my data set). Let's modify the original search so that I'm only looking at results that have arrived on Monday or today. I've also relaxed the number needed to show "badness" from 10 down to 2, given the smaller data set:
| folder mgr dict |
folder := RSS.RSSFeedViewer allInstances first feedTree selection.
mgr := RSS.RSSFeedManager default.
feeds := mgr getAllFeedsFrom: folder.
dict := Dictionary new.
cutoff := Timestamp readFrom: '10/17/05' readStream.
feeds do: [:eachFeed | | matches all |
    all := eachFeed items select: [:eachItem | eachItem pubDateString >= cutoff].
    matches := all select: [:eachItem |
        eachItem link
            ifNil: [false]
            ifNotNil: [('*blogspot*' match: eachItem link)]].
    matches size >= 2
        ifTrue: [dict at: eachFeed displayTitle trimBlanks put: (all size -> matches size)]].
That simply adds a cutoff date of Monday at midnight. So what's been hammered since then?
| Feed Title | Total Items | BlogSpot Items | Splog Percentage |
| IceRocket: Smalltalk | 80 | 5 | 6 |
| PubSub: Smalltalk | 29 | 24 | 83 |
| IceRocket: Cincom | 37 | 6 | 16 |
| Feedster Smalltalk | 38 | 6 | 16 |
That's a much smaller amount of damage - although, it looks like PubSub's matching algorithm is particularly vulnerable to this sort of attack.
One of the things people ask about a lot is the VW process model. The simple answer is that the VM is single threaded, so all VW processes are managed at the Smalltalk level - i.e., they aren't OS level threads. You can create OS level threads, but only in the context of threading an external API call. A good example of this can be seen in the various database connects that we ship - you'll note that we ship threaded and non-threaded (i.e., blocking) versions.
So given that, what are Smalltalk level processes good for? Well, bear in mind that you (as the developer) have full control over their semantics. That means that an application deployed on Windows will run exactly like one deployed on Linux (or Mac, or Unix) - a VW process is a Smalltalk artifact, so it's not going to be unpredictable. Let me walk through a simple example, using the BottomFeeder update loop. I subscribe to 315 feeds at the moment, so when the update loop fires, I get 315 VW level processes doing HTTP queries. If those were all OS level threads, the system would fall to its knees in seconds - I'd have to use a thread pool. Incidentally, I implemented one as an option for Bf - but I digress. Here's the main update loop (somewhat simplified for space reasons):
feedsToUpdate do: [:aFeed | | updater delay |
    updater := [self updateFeed: aFeed shouldForce: shouldForce totalFeeds: numberOfFeeds].
    self settings runThreadedUpdates
        ifTrue: [self runThreadedUpdateFor: aFeed updateBlock: updater]
        ifFalse: [updater value].
    "other code here..." ].
If I have threading turned off (useful on slow connections, where I don't want the queries competing for bandwidth), I just iterate over the list. The interesting piece is in the threaded updates:
runThreadedUpdateFor: aFeed updateBlock: updater
    self settings shouldThrottleThreads
        ifTrue: [self runWithThrottling: updater for: aFeed]
        ifFalse: [self runWithoutThrottling: updater for: aFeed]
That checks another setting, which controls whether the app should use a thread pool or not. The "throttling" code implements a pool, the non-throttled code just keeps forking off threads. That's how I run Bf, and it works fine (with a fast connection). Drilling to the throttled code:
runWithThrottling: updater for: aFeed
    self updateCounter
        addProcess: updater
        atPriority: self settings getUpdateLoopPriority.

addProcess: aBlock atPriority: aPriorityOrNil
    "add the process to the wait pool"
    self sem critical: [self waitingCollection add: aBlock->aPriorityOrNil]
That code simply adds the new process to a queue, which runs a limited number of processes at once. The non-throttled code?
runWithoutThrottling: updater for: aFeed
    | proc |
    proc := updater newProcess.
    self updateCounter addThread: proc url: aFeed url.
    proc priority: self settings getUpdateLoopPriority.
    proc resume
Now that demonstrates something useful about the level of control you have over a VW process. I'm setting the priority of the process (by default, it's in a range from 1-100, with 8 "named" levels). Then I'm resuming the process. A VW process is defined simply as a block (the snippet all the way at the top) which later gets forked off. In this example, I'm setting the priority and then resuming (forking) the process. I'm also holding a reference to the process, so that it can be killed (for instance, if you take BottomFeeder offline, the system goes ahead and whacks all the in progress threads in that loop, along with the update loop itself).
The priority levels I mentioned are used by the default process scheduler - which is written in Smalltalk. What does that mean? It means that you have full control over the way processes run in Smalltalk. The default model runs the highest priority process that is ready to run, but - at a given priority level - no process will preempt another of the same priority. In other words, it's not time-slicing. Say you wanted it to be? Well, that's simple - to timeslice a given set of processes, you simply have a higher level process manage them (which is what my throttle does to some extent). If you want to timeslice the entire system? Have a look at class ProcessorScheduler and change the way it manages things.
It's a nice system, and it gives you a very high level of control over how your system runs.
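The throttled path above boils down to "run only so many processes at once". One common way to get that effect is a counting semaphore. The following is a minimal workspace sketch under that assumption, not the BottomFeeder implementation: the names maxRunning, slots, lock, done and finished are made up for illustration, and the Delay stands in for an HTTP query.

```smalltalk
"Sketch only: cap concurrent workers with a counting semaphore.
 All variable names here are hypothetical, not BottomFeeder code."
| maxRunning slots lock done finished |
maxRunning := 3.
slots := Semaphore new.
maxRunning timesRepeat: [slots signal].     "start with three free slots"
lock := Semaphore forMutualExclusion.       "guards the shared collection"
done := Semaphore new.                      "signaled once per finished worker"
finished := OrderedCollection new.

(1 to: 10) do: [:i |
    [slots wait.                            "block until a slot frees up"
     [(Delay forMilliseconds: 50) wait.     "stand-in for the real work"
      lock critical: [finished add: i]]
        ensure: [slots signal. done signal]]
            forkAt: Processor userBackgroundPriority].

10 timesRepeat: [done wait].                "wait for every worker to finish"
finished size
```

All ten processes are forked immediately, but at most three get past `slots wait` at any moment. The `ensure:` block hands the slot back even if a worker fails, so one bad feed can't starve the rest.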
Interesting piece of news here - Ward Cunningham (father of the Wiki) has left MS to be an Eclipse evangelist:
Microsoft Corp. has lost one of its high-profile hires to an open-source consortium. Mike Milinkovich, executive director of the Eclipse Foundation, announced on Monday that Ward Cunningham is leaving Microsoft to join the staff of the open-source tool consortium. Cunningham's new title is Director of Committer Community Development. Cunningham, the father of the Wiki concept, joined Microsoft about two years ago. At Microsoft, he was not involved directly in social-networking-software development.
That last bit is interesting - what did Microsoft want Ward to do, if they weren't going to have him work in the social software world?
Update: Dave Buck noticed this yesterday. Have a look at the quote Dave pulls - that's a pretty low level of excitement for an evangelist, IMHO.
Ted Neward explains why he doesn't like wide open dynamic support in a language:
First, the technical: dynamic languages may choose to expose more meta-control over the language, but there's nothing inherent in the dynamic language that requires it, nor is there anything in a static language that prevents it. Languages/tools like Shigeru Chiba's OpenC++ or Javassist, or Michiaki Tatsubori's OpenJava clearly demonstrates that we can have a great deal of flexibility in how the language looks without losing the benefits of statically-typed environments. So to attribute this meta-linguistic capability exclusively to dynamic languages is a fallacy.
Secondly is the cultural issue: is the idea of granting meta-linguistic power (known as meta-object protocol, or MOP) to a language a good thing? Stu asserts that it is: "My concern is who controls the abstractions. Developer-oriented languages (like Scheme) give a lot of control (and responsibility) to developers. Vendor-oriented languages (like Java) leave that control more firmly in the hands of the vendor." So in whose hands are these abilities to change the language best placed?
*deep breath* I don't trust developers. There, I've said it.
Well, I'll take the contrary view (what a shocker!) - I don't trust the vendors. And I say that as the Product Manager for Cincom Smalltalk. When a vendor ships you a set of tools, you get the viewpoint of their developers as to how things ought to work. If that set of tools isn't malleable, you're just stuck. Hit a wall because the library isn't suitable for your needs? Too bad, you now have to argue with the vendor. Bearing in mind that you might not win.
Think that it'll be easy in the "obvious" cases? Heck, I'm the flipping Product Manager here, and I allegedly set direction - do you think the engineers buy everything I raise as a needed core library change (I've done a small number of them for BottomFeeder)? Heck no - how far do you think you'll get with Sun or MS?
The alternative is what you see in lots of Java projects - one more wrapper around the (insert your favorite example here, like String) class, because Sun decided to seal that one. It's just more pickaxe and shovel work to plow through, because it's simpler to not trust the developers. As opposed to those *cough* godlike *cough* library developers.
I'd be a whole lot more interested in Gillmor's blathering about attention if the site he flogs actually described what it does. It lets loose with a lot of buzzwords about my rights, and then asks me to join something. Umm, yeah - I've gotten the same pitch in gosh knows how many spams and junk mails too, Steve.
Here's a tip - you want this to go anywhere? Have the page that allegedly describes how great this stuff is actually tell me what the heck it is. In the meantime, I'll just use links. They make sense, and I don't need a set of buzzword bingo cards from a website to figure them out.
Bonus clue - I don't need to hand some website with less than no information on it my email address, name, and url to link.
One of the things that an aggregator allows you to do is keep up with a lot more information flow. As I said earlier today, I subscribe to 315 different feeds (44 of those are search feeds). I figured it might be interesting to see how much new content there is in a day from the non-media, non-search (i.e., mostly bloggers) feeds that I track. So, I opened up a workspace in BottomFeeder and started hacking out a script:
rejects := #('*feedster*' '*blogpulse*' '*google*' '*yahoo*' '*amazon*' '*icerocket*' '*rocketnews*' '*pubsub*' '*blogniscient*' '*digg*' '*sans*' '*infoworld*' '*computerworld*' '*linux*' '*slashdot*' '*wired*' '*rss.com*' '*internetnews*' '*comics*' '*file://*' '*technorati*' '*techrepublic*' '*meetup*' '*memeorand*' '*espn*' '*cnn*' '*extreme*' '*wbal*').
today := Date today asTimestamp.
basicFeeds := RSSFeedManager default getAllMyFeeds reject: [:each |
(rejects detect: [:each1 | each1 match: each url] ifNone: [nil]) notNil].
counts := OrderedCollection new.
basicFeeds do: [:eachFeed | | todays |
todays := eachFeed items select: [:each | each pubDateString >= today].
todays notEmpty
ifTrue: [counts add: eachFeed displayTitle -> (todays size)]].
sorted := counts asSortedCollection: [:a :b | a value >= b value].
It's a pretty simple script - I grab all the feeds, filter out the ones that are either media or search related, and then see which ones have content today. Then I slam the results into a collection, sort by frequency, and do an inspect-it on the results. Unlike those *cough* advanced *cough* languages in the mainstream, Smalltalk lets me do this at runtime, in the running application. Kind of cool :) Anyway, I wrote a quick script to slap that stuff in an HTML table:
| Feed | Posts |
| The Corner | 80 |
| MARS Activity | 35 |
| Daily Kos | 31 |
| Public Store | 28 |
| PCWorld.com - Latest News Stories | 26 |
| Bob Congdon | 25 |
| Sam Ruby's Comments | 22 |
| The Doc Searls Weblog | 22 |
| Taegan Goddard's Political Wire | 21 |
| ongoing | 20 |
| Eschaton | 20 |
| Samizdata.net | 17 |
| Lambda the Ultimate - Programming Languages Weblog | 17 |
| VodkaPundit | 16 |
| Cook Computing | 16 |
| Instapundit.com | 16 |
| Microsoft Watch from Mary Jo Foley | 15 |
| RSS News by CodingTheWeb.com | 15 |
| Radio Free Blogistan | 15 |
| Philip Greenspun Weblog | 14 |
| Dvorak | 14 |
| Web Things, by Mark Baker | 14 |
| Mark Bernstein | 13 |
| National Review Online | 11 |
| Exploration Through Example | 10 |
| lesscode.org | 10 |
| PragDave | 10 |
| Squeak People | 10 |
| MemoRanda | 10 |
| TalkLeft: The Politics of Crime | 8 |
| Little Green Footballs | 8 |
| N=1: Population of One | 8 |
| Sci Fi Wire | 8 |
| Media Blog | 8 |
| Sjoerd Visscher's weblog | 8 |
| Glenn Vanderburg: Blog | 8 |
| cst | 7 |
| Power Line | 6 |
| Scripting News | 6 |
| java.net Weblogs | 5 |
| Michelle Malkin | 5 |
| Sam Ruby | 5 |
| Science @ NASA | 5 |
| CincomSmalltalkWiki | 4 |
| Micro Persuasion | 4 |
| Traffic | 3 |
| cst comments | 3 |
| The Ornery American | 3 |
| The Indepundit | 3 |
| Don Park's Daily Habit | 3 |
| Cafe au Lait Java News and Resources | 3 |
| Larkware News | 2 |
| Travis Griggs - Blog | 2 |
| evhead | 2 |
| Dare Obasanjo aka Carnage4Life | 2 |
| Hugh Hewitt | 2 |
| Captain's Quarters | 2 |
| Joho the Blog | 2 |
| Corante Blog | 2 |
| Scobleizer - Microsoft Geek Blogger | 2 |
| Derek's Rantings and Musings | 2 |
| Alice Hill's Real Tech News - Independent Tech | 2 |
| Daypop Search - BottomFeeder | 2 |
| Software (Management) Process Improvement | 1 |
| Joi Ito's Web | 1 |
| Mark Watson's opinions on Java, AI, semantic web, and politics | 1 |
| d2r | 1 |
| planet squeak | 1 |
| Chris Pirillo | 1 |
| The Fishbowl | 1 |
| Rob Fahrni, at the core. | 1 |
| The Blog Ride | 1 |
| Windley's Enterprise Computing Weblog | 1 |
| Better Living Through Software | 1 |
| Steve Shu's Blog | 1 |
| Industry Analyst Reporter - Applications and Software News | 1 |
| Workbench | 1 |
| Matthew Yglesias | 1 |
| WCBS 880: Yankees on WCBS | 1 |
| cut on the bias | 1 |
| Austin Bay Blog | 1 |
| The Belmont Club | 1 |
| PVRblog | 1 |
| ScrappleFace | 1 |
| The Doctor is in | 1 |
| ARs closed Activity | 1 |
| Sam Gentile's Blog | 1 |
| Panopticon Central | 1 |
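The table-building script itself isn't shown. A minimal sketch of how such a script might look follows. It is an assumption, not the original: the sample rows stand in for the `sorted` collection of title -> count associations, the variable names are made up, and it does not escape HTML special characters in feed titles.

```smalltalk
"Sketch only: turn title -> count associations into an HTML table.
 The sample data and variable names are hypothetical."
| rows ws html |
rows := OrderedCollection new.
rows add: 'The Corner' -> 80; add: 'MARS Activity' -> 35.
ws := WriteStream on: String new.
ws nextPutAll: '<table>'; cr.
ws nextPutAll: '<tr><th>Feed</th><th>Posts</th></tr>'; cr.
rows do: [:each |
    ws nextPutAll: '<tr><td>';
        nextPutAll: each key;       "feed title"
        nextPutAll: '</td><td>';
        print: each value;          "post count"
        nextPutAll: '</td></tr>';
        cr].
ws nextPutAll: '</table>'.
html := ws contents.
html
```

Swapping `rows` for the real `sorted` collection is the only change needed to run it against live data in the workspace.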
Now, I didn't get all the non-blogs out, but that's good enough for now - it's down to 89 feeds that way. The MARS one warrants some explanation - it's the feed off our internal bug tracking system, and we are approaching full code freeze for the next release - so activity is high. Other than that, the real outliers (i.e., lots of posts in a day) are group political blogs. Some of the high numbers are also some kind of server reset of the feed, not actual new content. That's still a problem that can fool an aggregator - especially when the feed in question doesn't have IDs for the items.
Anyway - looking at "real" results, it looks like a dozen new posts is a lot - most people are well under that. In fact, if I filter the list to those who posted 10 or fewer times so far today, I get down to 63 feeds. It turns out that the 7 (8 after this one goes up) posts today put me up near the top of that list. In fact, 23 of the feeds only have one new item so far today.
So - if you skim the high volume news/search feeds, the posts on single author blogs aren't that hard to keep up with. At least not if you use an aggregator :)
This ComputerWorld story is mostly about Sun's hopes in the software business - which can mostly be summed up by: "We give our software away; why the heck can't we make money that way?"
Well, to add to that exciting suite of revenue makers, Sun is eyeing PostgreSQL:
"We're not going to OEM Microsoft but we are looking at PostgreSQL right now," he said, adding that over time the database will become integrated into the operating system.
That's Loiacono, VP of their software business. So does PostgreSQL stay open source? Does it stay cross platform? Is Sun just going to bundle it, or buy it? This article seems unclear.
I'm sure most of you have seen this Gene Spafford quote, but I just ran across it, and it cracked me up:
Secure web servers are the equivalent of heavy armored cars. The problem is, they are being used to transfer rolls of coins and checks written in crayon by people on park benches to merchants doing business in cardboard boxes from beneath highway bridges. Further, the roads are subject to random detours, anyone with a screwdriver can control the traffic lights, and there are no police.
There's other good stuff here.
Better Bad News has a hilarious take on splogs. They don't call it that - they're aiming wider :)
Joi Ito has an early look at an ACNielsen study on online shopping - and it shows that shopping habits differ quite a bit from country to country:
The US is way behind Europe in the amount of online shopping (ranking 11 worldwide), perhaps because mall shopping is so much easier than shopping in a European city. This encourages Europeans to shop online.
I've been surprised by how hard it is to buy things in Europe after 5 pm, that's for sure. Everything closes early - it's not at all like the US, where I expect to be able to get milk (or anything else) at midnight or later. I guess this study shows that it has an impact on how people shop online.
Anyway, interesting stuff.
Ryan brings up the Wikipedia quality issue that's been buzzing around lately, and runs smack into the real problem - after noting, via Dave Winer, that everyone has an equal voice on the Wiki, we get to this as the solution:
Identify people who have expertise or knowledge on certain subjects
That's harder than you might think - and it all depends on the subject. I find that Wikipedia is pretty good on historical subjects (at least older ones), and that's because any controversy that may have existed on the subject has passed. For instance - look up Julius Caesar - the history reveals that there have been a number of reversions lately, but the general information looks pretty good - the damage on that page is the garden variety "I'm excited by curse words" sort of damage.
Now have a look at something more recent, and more contentious - the 2000 US Presidential election. Go browse the blogosphere if you think that there's anything resembling consensus on how that went down. I can't see there being a fully objective view of something as controversial as that election for a long time - it wasn't until deep into the 20th century that the 1876 election was viewed with any objectivity, for instance.
So back to the expertise question - how does a "real" encyclopedia deal with this problem any better than Wikipedia does? Take any controversial topic for which varying interpretations exist (i.e., nearly any historical event that happened within the last 100 years) - where do you find experts who have "unassailable knowledge" of some event? The bottom line is, you don't. Let's take a subject I've read a fair bit about recently - WWI. It's long enough ago now that some level of objectivity is creeping in - but it's still colored by subsequent events (WWII, the Cold War) - enough to generate controversy. What's definitive?
And that's just five books I've read on the subject - five books with very different discussions of how (and why) the war was fought. Let's take the encyclopedia up now - how does the entry on WWI address the war? How does it explain the hows and whys? I'll tell you how - it uses the (then current) academic consensus. Is that "correct" in any abstract sense? Who knows? It might be - or it might not be. The reality is, even WWI is still too controversial for there to be a reliable "consensus" view. Which means that the entry - whether it's in printed copy or bits - is just going to be some compromise view.
Exactly how does that differ for Wikipedia and any other work? It doesn't. The reality is, having "anyone" be able to edit doesn't mean that "everyone" will. Most people don't care deeply about any particular subject - the ones with an interest (and, of course, the vandals) will be the ones who show up. With the printed encyclopedia, anyone whose views fall outside the current academic consensus will just get cut out immediately. With Wikipedia, they have a chance to get their take peer reviewed and commented on.
Which leads me to the opposite view from Winer, and Ryan, and most other people - I'll take the Wikipedia approach over the standard. It's far more likely to allow a larger set of views to fight it out.
You would think that an outfit as large as Google would do a trademark search before rolling out a service:
A trademark dispute has forced Google to re-brand its Gmail web mail service in the UK. Existing users get to retain their Gmail address (at least for now) but from Wednesday onwards new UK users will be given a Googlemail email address instead.
UK-based financial services firm Independent International Investment Research (IIIR) said its subsidiary ProNet Analytics has been using the Gmail name for a web-mail application since the middle of 2002, two years before Google began offering Gmail accounts to consumers. The email service offered by ProNet, by contrast, is used mainly by investors in currency derivatives.
I suspect that the problem here is a US-Centric one - I bet they looked in the US, saw no problems, and went ahead with a world-wide rollout. This is going to be a big problem for companies over the next little while.
There's been a lot of hype recently about Sun and Google coming up with a web based replacement for Office. I've been quietly skeptical about this one for a while, but this article says everything I wanted to say about it.
The bottom line - the network isn't reliable enough for day to day applications. Heck, that's why I use POP access to gmail - I don't want to have to rely on network connectivity to look at mail. I take planes and trains often enough that being completely cut off from my mail for hours would be a real problem. I don't use web based aggregators for the same reason - I like having offline access (and Comcast makes sure that I have regular, short term offline experiences even while I'm in my office).
The last thing I want is a graphical VT-100 - and I don't need people explaining that reduced functionality and the danger of a complete loss of service is progress, either.
Jon Udell posted on something I saw, but mostly skipped over a couple of days ago - this "LifeHacker" story in the New York Times. Truth be told, since the Times put the TimesSelect thing in place, I've paid a lot less attention to them - I assumed this story was behind that, and moved along (yet another way that the Times has marginalized themselves, but I digress).
Anyway - the heart of the story is this snippet here:
Lots of people complain that office multitasking drives them nuts. But Mark is a scientist of "human-computer interactions" who studies how high-tech devices affect our behavior, so she was able to do more than complain: she set out to measure precisely how nuts we've all become. Beginning in 2004, she persuaded two West Coast high-tech firms to let her study their cubicle dwellers as they surfed the chaos of modern office life. One of her grad students, Victor Gonzalez, sat looking over the shoulder of various employees all day long, for a total of more than 1,000 hours. He noted how many times the employees were interrupted and how long each employee was able to work on any individual task.
When Mark crunched the data, a picture of 21st-century office work emerged that was, she says, "far worse than I could ever have imagined." Each employee spent only 11 minutes on any given project before being interrupted and whisked off to do something else. What's more, each 11-minute project was itself fragmented into even shorter three-minute tasks, like answering e-mail messages, reading a Web page or working on a spreadsheet. And each time a worker was distracted from a task, it would take, on average, 25 minutes to return to that task. To perform an office job today, it seems, your attention must skip like a stone across water all day long, touching down only periodically.
That's not going to be a productive way to work, regardless of your profession. Jon is hoping for tool support to fix this, but I think that's fundamentally the wrong way to go - and online "status" pages that let other people know how busy you are aren't going to cut it either - people will just blow right past that, deciding that their stuff is more important - how many phone numbers do you run through when you get voice mail for someone on a given line? Do you just stop there? I didn't think so :)
Now, I work out of a home office, so the problem of people walking up to me goes away. That leaves the phone (land line and mobile), email, news aggregator, and IM. All of these are easy to ignore, if I want to. The only real way to solve this problem is personal fortitude. You want to focus on a task? Fine - close the door, mount a "do not disturb" sign, and get to work. Technology isn't going to save you here.
This seems a little harsh, but boy, was it funny. Via ArcterJournal.
Jeff Jarvis commented on the splogs this weekend (like everyone else), but there are a couple of interesting comments if you scroll down - have a look at what Steven DenBeste said:
Ultimately there isn’t any permanent solution to this kind of thing. Any system which permits anyone to create readable material on anyone else’s system will be abused by spammers.
The only real solution is to assume that a certain percentage of people out there are hostile, and to design accordingly. Automatic trackback, for instance, was always a terrible idea because it assumed universal good faith.
And then below that, this:
By the way, has it occurred to you that it is economically to Google’s advantage to let the current situation persist? As long as Google can keep its own search results clean, then the spam blogs will make everyone else’s search engines useless and thus drive traffic to Google.
Why would Google want to change the situation? Certainly it would be both illegal and immoral for Google to actively work to pollute the search results of its competitors, but I don’t think that benign neglect of spammer abuse of Blogspot is actionable, and it serves the same purpose. Certainly if that’s what Google is thinking, then it’s slimy. But not illegal, and not actionable. And it’s difficult to see why Google would want to expend any significant effort to try to fix the situation.
Certainly food for thought. Also, make sure to walk through to Elliot's post on splog prevalence on BlogSpot - apparently, it's nearly a third (I'd really love to see historical tracking on that!). Kind of blows a hole in Evan Williams' breezy 1% nonsense...
I suppose the Lisp folks see the same thing, just a bit slower. The evolution of development languages is clearly in the direction of what Smalltalk is, with the great mass of developers being dragged kicking and screaming, begging not to have C syntax taken away from them. Consider the progress:
From the perspective of a Smalltalker, it looks like a grudging acceptance that we were right all along, but the foot dragging and wailing can still be heard. Kind of like a toddler being put to bed...
David Hasselhoff told Australia's Rove Live TV show that he's acquired the film rights to his old Knight Rider TV series and still plans to turn it into a feature film, according to the Moviehole.net Web site.
Spam hasn't been a huge problem on the CST blogs - for one thing, we have obscurity (my own server) on our side, and, for another, the various blocking schemes I've put in place seem to work pretty well. Even so, it's useful to be able to delete multiple comments at once - so that's a feature of the web admin stuff now. If you login, and go to the post editing page, you can select a post, see all the comments, and select which ones (if any) to delete.
One of the things I bring up here is why overriding base library behavior is often useful - I thought it might be a good idea to give an example of that, and show how you manage it in VisualWorks.
Here's a view of the Browser, with a method overridden:

I've shrunk the image for space reasons, but in the top right you'll see that the method is red. That indicates that I've overridden a method owned by a different package (WithStyle, in this case) in the posting tool's package. The override is this snippet of code I added:
(value isNil or: [value isEmpty]) ifTrue: [value := ParagraphEditor currentSelection].
In the method above, "value" holds a string that will be the default choice in a dialog box that is popped up for data entry. In this case, it's popped up when the user wants to insert an image into the post. I figured it would be nice if the tool remembered the last image that got uploaded, and made that the default choice. As it happened, the WS method was usually leaving that choice blank (thus, requiring me to type it all in).
I spoke to Michael, and he told me that better pluggability there would be a good thing. In the meantime though, I'm stuck - unless I override. So I did that, and you see the way the browser displays it - in a way that is easy to pick out. I can also version this override off separate from the WS code (thus, not perturbing their codebase). Later, after they've addressed this, I simply remove the override (a menu pick in the browser) and adjust my code to deal with their pluggability.
Simple, and it gets the job done - and in the interim - unlike the situation with Java, or C# - I don't have to either wait for the vendor, or write an entire replacement library/wrapper to get around the limitation. Two lines added to an override, and I'm done.
Of course, some people "don't trust developers", so they are happy wearing the chains the vendors hand them...
Gosh forbid I should be able to execute a search and find a book that I might want to buy - no, publishers and authors seem to like the current system, where finding a reference requires a trip to the library (which most people won't make). Has it occurred to these fools that better searching will lead to more sales? Or do they take the same stupid pills that the RIAA and MPAA are on?
Via Scoble, I see that the awful idea that is the Office 12 Ribbon is already spreading. Great - I really wanted more wasted screen space everywhere. The Office 12 team that came up with the Ribbon really, really needs to be flogged.
Better yet, they can come to Columbia, and join the local highway department. Those guys are filled with bad usability ideas for the roads, so the Office 12 folks will fit right in.
PR Differently is pretty sure that his prediction about the demise of large parts of print media is coming along. I'm not about to argue the other side of that one :)
Jonathan Schwartz makes the point that rewrites done just for the sake of a new language/toolkit don't make sense - something the entire industry collectively forgot about a decade ago:
Before I receive 2,000 email critiques, you should know my roots are in desktop software. So lest you think I'm coming at this from the perspective of a knuckle dragging big iron computer guy, that's not me.
As a software guy, here's a simple (though often irritating) rule behind user oriented software: The language in which a product is written has nothing to do with the value it conveys. Coming from the company that produced Java technology, that probably sounds a little odd. But it's a simple truth, especially when it comes to users: if the app's no good, it's no good, even if it's implemented in Java. Or PHP. Or Rails.
...
Because rewriting an app simply to use a new toolkit isn't creating value for consumers. Creating an application or service that delivers unique value is what captures users. And the internet gave some developers a tremendous opportunity to deliver unique value - by radically simplifying basic networking, enabling connectivity and community on a truly global scale.
Couldn't have said it better myself.
Dare says that "Web 2.0" is meaningless hype, and quotes Joel Spolsky, who has a good post out on the same topic. Here's Dare's summary:
I feel the same way. I am interested in discussions on the Web as a platform and even folksonomies (not tagging) but the marketplace of ideas has been polluted by all this "Web 2.0" garbage. Once again, I've flipped the bozo bit on Web 2.0. Like Joel, you won't see any use of the term on my blog or in items I link to from now on.
I think they are both dead on target with this. There's a lot of buzzword bingo going on, and it looks like people are starting to spend money stupidly again.
Steve Rubel is down on Flock (something I haven't seen), but that's not really the point of this post. Here's the snippet from Steve that I want to look at:
Last night I tried out Flock, a new Mozilla-based browser that's getting a ton of buzz. The press is chiming in here too. Originally just a handful of people were invited to try Flock, which is in developer preview. Unfortunately, Flock installers quickly spread around town and the company released it out to everyone to try.
That last bit is important - Flock installers quickly spread around town. The error made by the Flock people is in the assumption that you can make a limited release available on the web. If you have something that well read people are going to look at - and comment on - then it will stop being a limited release within a few minutes. Which might be a problem, depending on whether your stuff was really ready for general release.
Bottom line - there's pretty much no such thing as a limited release anymore.
So is this report from Boing Boing some limited ISP problem, a large spam surge, a viral attack of some sort, what? I haven't seen any slowdown myself, so it's rather abstract to me at this point.
Every so often, I make a mistake in the deployment of code from my test server to the production server (the Smalltalk image that runs this and the other blogs here). In those cases, I go back to the methods I updated and look to see what's different. Now, I'd rather not take the server down for this kind of thing, so instead I do something like this and load the patched method into the system:
[self codeThatMightBeWrongHere] on: MessageNotUnderstood do: [:ex | Transcript show: ex errorString; cr].
Then I do something with the server that will exercise the modified code, and watch the Transcript (scrolling to a file) to see what happens. Once I figure that out, I restore the original code, do more testing on the test server, and deploy the needed fix. All without taking the server down. It's one of the cooler aspects of having a full development system available as your deployed server.
I wouldn't have thought it was possible, but I think I've spotted a well of stupidity that's actually deeper than the one Darl McBride lives in:
Charlotte, N.C.-based Scientigo owns two patents (No. 5,842,213 and No. 6,393,426) covering the transfer of "data in neutral forms." These patents, one of which was applied for in 1997, are infringed upon by the data-formatting standard XML, Scientigo executives assert.
Scientigo intends to "monetize" this intellectual property, Scientigo CEO Doyal Bryant said this week.
And I thought the boys from Eolas were ambitious.
Time for my weekly look at the logs - the BottomFeeder downloads dropped back from the stratospheric levels they reached last week, to a still respectable 403 per day:
| Platform | BottomFeeder Downloads |
| HPUX | 672 |
| Windows | 637 |
| Mac 8/9 | 399 |
| Sources | 308 |
| Mac X | 216 |
| Linux x86 | 182 |
| Update | 175 |
| CE ARM | 122 |
| Windows98/ME | 29 |
| Linux Sparc | 28 |
| Solaris | 20 |
| AIX | 16 |
| Linux PPC | 7 |
| SGI | 6 |
| ADUX | 3 |
| Source Script | 3 |
Those HP download numbers are a source of constant amazement to me. Off to the html page accesses, where it looks like IE is staging a comeback:
| Tool | Percentage of Accesses |
| Internet Explorer | 45.8% |
| Mozilla | 39.9% |
| Other | 7.7% |
| MSN Bot | 2.3% |
| Google Bot | 2.3% |
| Java | 2% |
I guess the uptick in readership is driving me more toward the average browser usage, which is still heavily IE. On to the RSS pages:
| Tool | Percentage of Accesses |
| Mozilla | 27% |
| BottomFeeder | 12.7% |
| Other | 12.6% |
| Net News Wire | 10.3% |
| BlogSearch | 4.8% |
| Safari RSS | 4.4% |
| Planet Smalltalk | 3.5% |
| NewsGator | 3.2% |
| Magpie | 2.8% |
| RSSReader | 2.5% |
| Internet Explorer | 2.4% |
| SharpReader | 2.3% |
| BlogLines | 2.1% |
| Feed Reader | 1.5% |
| Feed Demon | 1.5% |
| RSS Bandit | 1.2% |
| Liferea | 1.2% |
| Google Bot | 1% |
| Jakarta | 1% |
| JetBrains | 1% |
| News Fire | 1% |
Looks like the RSS aware portion of the audience is still very Mozilla and Mac centric - with the rest of my audience spread across a very diverse range of tools. On the other hand, if you look at the tools owned by NewsGator now, they have 15% of my audience.