web
October 9, 2005 8:51:14.364
Robert has a rather long response to my "what's the context" post. I left a comment over there, but I figured I'd just toss that here:
"Here's a clue. If a human can find a few manufacturers within a few minutes, then a search engine should be able to find them even faster. After all, humans design search engines and the algorithm I used to find the Sony, Panasonic, and Samsung site can be replicated pretty easily."
The point of my post was the lack of context in a one word search. You might want manufacturers; someone else might want a definition; a third person might want a set of reviews. Short of you providing more context, I have no idea why you think a search engine should be able to figure out what you want. Now, that context could come from more terms - it might come from the engine tracking your searches over time and learning about what you like.
I'm not questioning your motives; I am questioning the assumptions you make. The assumption looks an awful lot like "I want this kind of result; everyone will want the same thing; therefore (insert engine provider here) should provide that result no matter what I enter".
That seems an awful lot like the blind spot Dave Winer has about how aggregators should work. He likes a "river of news" sort of view, and he translates that personal preference into "that's how all tools should work". Which is how your post looked to me.
Given a one word search, how is the back end (without some kind of historical tracking to give it context) supposed to know what sort of results you want from an ambiguous entry?
I don't have any notion that "search is done", or that the engines have results that are as good as they are going to get. What I am saying is that context is a critical component of any answer - machine generated or human generated. How many times have you answered the wrong question? I've done it a lot, and it happens to me because of a simple failing on my part - I have a tendency to assume the question I've been asked before I hear the whole thing, and I then respond to the assumption. In my experience, lots of people do that.
That failing is essentially a lack of context - in that case, it's not that context is not provided, but that I'm not hearing it. Engines have a similar problem though. You sit down to do a search with a whole set of related information (context) in your head, but you type one word (or acronym) into the search engine. It comes back with results, and you get upset that they don't match your expectations. That's because the engine didn't have access to all the context that you have, but didn't (and in many cases could not have) provided.
Basing assumptions on Ray Kurzweil's assumptions about human/computer merging is too much of a leap for me, but Robert goes right out and makes it:
Another fairly common argument is to ask me to go around asking other human beings "HDTV?" and see what they say. That's lame. We don't use search engines the same way we use friends. And, anyway, if I went into an HDTV store, I'd ask "do you have a list of HDTV manufacturers?" and they'd be able to provide me a list right away. At my camera store I had a list of all the camera brands. In fact, I often let customers see my wholesale catalogs. That info is out there, just not available in search engines. Reading CNET tonight I see that Google's CEO said it might be 300 years before Google indexes all the world's information and makes it searchable. Now THAT'S what I'm talking about! Although, 300 years? I doubt it. I suspect we'll get to 99.9 % within 30 years. Eric Schmidt should read Ray Kurzweil's new book, the Singularity is Near, for why. By the way, MSN gives a better result than Google when you search on the Singularity is Near. Here's that result on Google and here's the same search on MSN.
The comparison to a (presumably knowledgeable) sales clerk, who can infer context from full sentences and from your presence in the store, isn't impressive either. Try just walking up to him and saying "HDTV?". Most likely, he'll point to the section of the store that has them - not to a catalog list of providers.
BottomFeeder
October 9, 2005 10:52:10.084
With some help from Anthony Lander, I just found and fixed a startup bug in BottomFeeder. If you change the look and feel (not common, I guess, since this hasn't been reported before), Bf starts up looking like it's lost all the feeds, and then isn't truly functional. The problem was a timing bug that cropped up because of the new tabbing code. I have an update on the site (meaning, look for updates - there's not a new full build out).
sports
October 9, 2005 12:26:11.343
It's time to find out what the 2005 Yankees are made of. Down 2 games to 1, they win today or they stay home. The Red Sox were clearly outclassed by the White Sox - less so than the Padres, who were clearly not in the same league as the Cards. Thus far, the Yanks haven't been outclassed, but they have been outplayed and out-hustled. Tonight we'll see what the team has up their sleeves.
events
October 9, 2005 19:22:43.953
Blaine Buxton will be presenting Seaside at the Omaha STUG on the 11th:
This month I will be presenting Seaside, Avi Bryant and company's brilliant continuation-based web framework. People have been requesting it again so I thought it was time to do it again. It should be a lot of fun as always!
Here's all of the details:
When: October 11, 2005, 7pm - 9pm
Where: Offices of Northern Natural Gas
1111 S 103rd Street
Omaha Nebraska 68154
Office is at 103rd & Pacific. Guests can park in the Northern visitors parking area back of building, or across the street at the mall. Enter in front door, we'll greet you at the door at 7:00pm. If you arrive a bit later, just tell the guard at the reception desk you're here for the Smalltalk user meeting in the 1st floor training room.
Sounds like a good time.
sports
October 9, 2005 19:27:49.145
All I can say is wow - 18 innings and a walk-off home run ends it with a Houston victory. Good thing for Houston, too - if Clemens ran into trouble, they had no one left (except Pettitte, who was not rested). I "only" watched the last 7 innings, but that felt like a full game. Now that's baseball.
sports
October 9, 2005 23:20:22.487
Boy, that was one tight game. Chacon pitched very well, and so did Lackey for the Angels - especially given that he was working on 3 days' rest. The plan had been for Jarrod Washburn to pitch, but he got sick - apparently very sick.
The Yankees pulled it out by coming up with 3 runs between the 7th and 8th, without fireworks. The amazing thing was that the Angels pen gave up the lead, while the Yankees pen held it - Leiter actually came in and did the job before Rivera closed it out.
One more tomorrow night.
events
October 10, 2005 7:50:58.398
I'll be in London for the SPA 2006 conference this spring - I'm speaking on Tuesday as part of a "hot topics" slot. This is a fun conference - I've enjoyed it every other time I've attended. This year is going to be tiring though - I have a family event in LA on March 25th, and have to fly straight off to London. Ugh.
music
October 10, 2005 8:25:35.059
Newsweek has way too much sympathy for the clowns who run the music industry - have a look at this:
The industry doesn't want to repeat a history of undervaluing itself. In the days when its business plan was simply to promote and peddle music, it footed the bill for producing videos, and initially was only too happy to give them to MTV to help build buzz. For the Viacom-owned network, the videos drew huge audiences, building MTV into a multibillion-dollar asset. "We watched people make fortunes and create valuable assets off of our music," says a former top exec who feared risking his role, if he were identified by name, as an industry consultant.
Wow, what a self-absorbed jerk. Back when music video came out, CDs were selling for prices between $15 and $21. Bear in mind that those dollars were worth more then, too. The CDs cost how much to create? Virtually nothing. How much of that revenue went back to the actual artists? Not a lot. So this executive can go eat a handful of CDs and choke on them for all I care.
And now, the same hacks are upset over their share of the profits from things like iTunes. At present, they get paid 60-70 cents of the 99 cent download price - and they think prices aren't set properly - they want new stuff priced higher. Perhaps they haven't been looking at behavior in the buyer space recently - even at the 99 cent price, lots of people are bypassing the store and ripping music for free. If the execs think that raising prices will solve that problem, they have another think coming. If anything, prices need to drop enough to make that bypassing not worthwhile.
web
October 10, 2005 12:17:24.141
Well. We had a discussion on the merits of OPML awhile back - Scoble's theory was that it was "good enough" and us tool providers should just suck it up. Let me explain what kind of problem that creates.
OPML is the default import/export format for most aggregators. However, the OPML *cough* spec *cough* doesn't actually specify anything for that purpose - which has resulted in various tool providers making various things up, and coalescing (more or less) around a de-facto standard. Except when there are variances.
I had a new user trying out BottomFeeder, and they took their export from another tool, and ran it through the import. The following Flickr feed gave them trouble - the troublesome part is below:
http://urlHere?id=IDWentHere@passwordWentHere&format=atom_03
Looks normal enough, right? Well, the importer in Bf was mangling that url. A BottomFeeder bug, you say? Well, not exactly - as it happens, I had to add special handling for the @ character about a year ago, when I was informed that an aggregator (Liferea) embedded usernames and passwords into the url that way - but they weren't actually part of the url. For feeds that use HTTP auth (or digest auth), that tool slapped them onto the url, expecting the importing tool to figure that out and cache them. That worked fine, until I stumbled across this Flickr usage, which embeds that sort of ID information right in the url, and expects to have it stay there.
Goodie - now I have code that has to look specifically for those two cases and differentiate them. This is why OPML sucks, and the author of the *cough* spec *cough* should be taken out behind the woodshed.
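For the curious, here's a minimal sketch of one way to tell the two cases apart. It's in Python rather than the Smalltalk that Bf is written in, and it is not the actual importer code. The function name and the example host are made up for illustration, and it assumes the exported credentials take the usual `user:password@host` form: an @ in the authority part of the url gets treated as credentials and stripped, while an @ anywhere after the first `/`, `?` or `#` is left alone.

```python
from urllib.parse import urlsplit, urlunsplit


def split_feed_credentials(url):
    """Return (clean_url, username, password) for an imported feed url.

    Credentials are stripped only when the '@' sits in the authority part
    (before the first '/', '?' or '#'). An '@' in the path or query string
    is part of the url proper and stays where it is.
    """
    parts = urlsplit(url)
    if parts.username is None:
        # No userinfo in the authority: hand the url back untouched.
        return url, None, None
    # Everything after the last '@' in the authority is host[:port].
    host = parts.netloc.rpartition("@")[2]
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, parts.username, parts.password


# Credentials slapped onto the authority are split off...
print(split_feed_credentials("http://user:secret@example.com/feed.xml"))
# ...while an '@' inside the query string is left exactly as exported.
print(split_feed_credentials(
    "http://urlHere?id=IDWentHere@passwordWentHere&format=atom_03"))
```

This only works because the standard url parser ends the authority at the first `?`; an export that wrote credentials some other way would still need its own special case.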
humor
October 10, 2005 15:30:18.062
sports
October 10, 2005 23:52:14.882
Yes, there was a blown call that probably cost the Yankees some runs - the home plate umpire made a horrible call on Cano after the strikeout. That wasn't what cost the game though. The misplay by Sheffield and Crosby that gave up 2 runs - that was the game. The complete lack of hitting with men on base? That was the game too. Give the Angels credit - they capitalized on the mistakes. I don't give them much of a chance against Chicago - the White Sox are rested, and I think they have the better team. But the Angels were better than the Yankees this year, that's for sure.
web
October 11, 2005 0:01:36.600
Looks like Chris Pirillo's gada.be would be an interesting site to play with, but it's not responding now. Looks like he needed more server oomph behind a service that Scoble and Winer (and everyone else) immediately advertised and promoted.
law
October 11, 2005 2:10:01.952
I think it's time to call out this patent nonsense. Have a look at this mess by Apple and the following people, whose names are on a patent that they all know full well is bogus:
- Richard Williamson
- Daniel Wilhite
- Jack Greenfield
- Linus Upson
Apparently, they were just granted this patent (filed in 2002) for the brilliant innovation of the proxy object. Well heck - I think there are a few instances of prior art. Just taking one I can rattle off the top of my head, let me fire up VisualWorks 2.5, released in 1995. Well - I see class LensAbsentee (actually shipped with VW 2.0, which came out in 1993, iirc). LensAbsentee is an abstract superclass (but not part of the "normal" hierarchy, as it's not descended from Object). Its purpose? Why, when you do a DB query using the Lens, you don't get full complex objects - you get - wait for it - proxies for them. When you actually try to deal with them, they fault in. Kind of like the way the patent explains it:
A method for providing stand-in objects, where relationships among objects are automatically resolved in an object oriented relational database model without the necessity of retrieving data from the database until it is needed. A "fault" class is defined, as well as fault objects whose data haven't yet been fetched from the database. An object that's created for the destination of a relationship whenever an object that includes the relationship is fetched from the database. When an object is fetched that has relationships, fault objects are created to "stand-in" for the destination objects of those relationships. Fault objects transform themselves into the actual enterprise objects—and fetch their data—the first time they're accessed. Subsequently, messages sent to the target objects are responded to by the objects themselves. This delayed resolution of relationships occurs in two stages: the creation of a placeholder object for the data to be fetched, and the fetching of that data only when it's needed. By not fetching an object until the application actually needs it, unnecessary interaction with the database server is therefore prevented.
So hey - you four "brilliant" patent holders - there's prior art staring you in the face (and I'm sure that there are older things than this - TopLink for Smalltalk predates the Lens, iirc). Do any of you have the integrity to admit it?
Update: I had pulled the patent links from this page, which apparently had them wrong. The links are fixed now, so that you can follow the absurdity in all its glory.
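To show just how ordinary the idea is, here's a rough sketch of a faulting proxy in a dozen or so lines of Python. This is not the Lens code and not the code from the patent; the class and function names are invented for illustration. It also delegates to the fetched object rather than transforming itself into it, which is a simplification.

```python
class Fault:
    """Stand-in that fetches its target the first time it is really used."""

    def __init__(self, loader):
        self._loader = loader      # zero-argument callable that does the fetch
        self._loaded = False
        self._target = None

    def __getattr__(self, name):
        # Python calls this only for attributes the stand-in itself lacks,
        # which means anything addressed to the real object.
        if name.startswith("_"):
            raise AttributeError(name)
        if not self._loaded:
            self._target = self._loader()   # fault in: the one and only fetch
            self._loaded = True
        return getattr(self._target, name)


class Customer:
    def __init__(self, name):
        self.name = name


fetches = []                        # records each simulated database hit


def load_customer():
    fetches.append("customer 42")   # stands in for a real query
    return Customer("Ada")


customer = Fault(load_customer)     # cheap: nothing fetched yet
print(fetches)                      # still empty
print(customer.name)                # first access triggers the fetch
print(customer.name, fetches)       # second access reuses the fetched object
```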
web
October 11, 2005 8:01:05.765
I think the key thing to bear in mind about Yahoo blog search is this comment from the CNet story:
Initially, Yahoo News Search will have access to material from hundreds of thousands of blogs but will eventually scan millions, said Joff Redfern, a director in Yahoo's search unit.
Which is why the results are (thus far) disappointing. I think the launch really amounts to a public beta.
analysts
October 11, 2005 12:41:19.493
We have an annoying (and seemingly inexplicable) issue with the Media Center PC. Every so often, it will stop supporting sound to or from the TV. Other sound works fine, leading me to think it's the tuner card. Rebooting always solves the problem, but it's annoying. Anyone seen this, or have an idea what I should look for?
blog
October 11, 2005 12:44:16.213
The blog has been mostly inaccessible for the last hour or so - it was a scaling issue with one of the early things I did in the server code. At the bottom of the page here is a list of referers. I have Smalltalk code that generates that, and it has been running in the same image that serves the blog pages. The problem? Traffic (especially spam traffic) is up - so having a process that read log files, filtered them (in memory) and then spit out the cleansed files to be read by the server was a little too much - each time the log scan code ran, it was slowing the server down - and finally, today, just making it inaccessible.
The answer, of course, is to move that code out of the main server and run it as a cron job - which is what I have to do now.
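As a rough illustration of what such a standalone job might look like (sketched in Python, not the Smalltalk I'm actually using), the script below streams a log file line by line instead of filtering it all in memory. It assumes Apache's "combined" log format; the spam word list and the function names are made up for the example.

```python
import re

# Assumption: Apache "combined" format, where the referer is the
# second-to-last quoted field on each line.
LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')
SPAM_WORDS = ("casino", "poker", "pharma")   # hypothetical blocklist


def clean_referers(lines, spam_words=SPAM_WORDS):
    """Yield each distinct non-spam referer, in first-seen order."""
    seen = set()
    for line in lines:
        match = LINE.search(line.rstrip("\n"))
        if not match:
            continue                      # not a line we understand
        referer = match.group("referer")
        if referer in ("", "-") or referer in seen:
            continue                      # no referer, or already listed
        if any(word in referer.lower() for word in spam_words):
            continue                      # looks like referer spam
        seen.add(referer)
        yield referer


def write_clean_referers(log_path, out_path):
    """Read one log file and write the cleansed referer list for the server."""
    with open(log_path, encoding="utf-8", errors="replace") as log, \
         open(out_path, "w", encoding="utf-8") as out:
        for referer in clean_referers(log):
            out.write(referer + "\n")
```

A cron entry would then call `write_clean_referers` with whatever paths apply, and the blog server only ever reads the finished output file.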
blog
October 11, 2005 14:32:58.358
A fair bit of the new traffic here is actual new readers - but there's a disturbing amount of attempted referer spam as well. The vast majority is offers for various drugs (the same ones advertised on TV), gambling, and of course, that perennial favorite, porn.
We have some filtering going on at the Apache level now to address that, and I've got the new process for dealing with referers coming up. What a bundle of fun this is :/
rss
October 11, 2005 16:46:04.292
development
October 11, 2005 18:21:27.046
Until you actually try to decouple it. Earlier today, I had a server issue that related to a process - scanning for referers - that needed to be run outside the main server. Fine; turning that off was simple. As it happens, running it separately surfaced a whole raft of little assumptions I'd made along the way.
It took a bit of effort to make the scanning service truly standalone - it was grabbing various bits of information (file locations, etc) from the blog settings information. Running independently, I didn't really want to saddle it with all that extra dreck, so I had to decouple that. Took a fair bit of trial and error to find all my assumptions too.
Bottom line - decoupling services is always harder than you think it will be.
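One way to picture the decoupling, as a hedged Python sketch with entirely hypothetical names: the scanner gets its own tiny configuration object, built from command line arguments, and never touches the blog settings at all.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ScanConfig:
    """Only the settings the scanner needs; every name here is hypothetical."""
    log_dir: Path
    output_file: Path


def config_from_args(argv):
    """Build the config from command line arguments, not from blog settings."""
    if len(argv) != 2:
        raise SystemExit("usage: scan_referers LOG_DIR OUTPUT_FILE")
    return ScanConfig(log_dir=Path(argv[0]), output_file=Path(argv[1]))


def logs_to_scan(config):
    """The scanner sees a ScanConfig and nothing else from the blog."""
    return sorted(config.log_dir.glob("*.log"))
```

Every hidden assumption then shows up as a missing field on `ScanConfig`, which is a lot easier to spot than a stray reference into the server's settings.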
media
October 11, 2005 18:27:26.530
Wow - I knew that newspapers were losing readers steadily, but I didn't realize just how bad it is - their readership is actually dying off.
Newspaper readership is down. Fewer young people are picking them up, and the average age of a newspaper reader is now 55, according to a Carnegie Corporation study. Many papers have been losing circulation at alarming rates across all age groups.
Newspaper profits and the stock prices of the companies that own them were also down during the first half of 2005. The biggest newspapers are cutting staffs, closing foreign bureaus and taking other steps to meet their owners' profit goals.
An average age of 55? Wow. That's got to alarm the finance guys.
spam
October 11, 2005 22:22:21.965
Well, I'm tired of rewarding the spammers via the referer lists. Instead of putting that list at the bottom of the posts on the site, I've moved it back behind a POST - only the blog owner/admin can see them now, after logging in. It won't stop the flood of spam, but it will stop rewarding it.
spam
October 12, 2005 8:16:44.371
Charles Johnson reports that referer spam attacks are up:
Behind the scenes, there is a pretty amazing swarm of robots hitting our Most Recent Referrers page tonight, using zombie servers (servers infected with a previous virus that leaves a back door open) with a range of proxy IP addresses, many in China, to try to plant URLs among our referrers that link to the usual dreary list of illicit pharmaceutical products. This kind of idiot spamming is a constant annoyance, but tonight’s robot swarm is extraordinary for its sheer volume.
That was the problem that took this site offline for a bit over an hour yesterday, and resulted in my moving the referer list behind a post form, accessible only to the author of a blog. I guess a new assortment of bots is out there being played with.
Update: Those are the same blasted spam referers we're seeing.
web
October 12, 2005 10:19:54.910
Yes, gada.be is a cool search aggregator - and now that the servers are in order, it brings results back pretty darn fast. Still, there's something missing that I need to make it useful for me (YMMV, of course) - syndication-ready results. For instance, here's a gada.be search for BottomFeeder - but I have to be in my browser to see that. The problem? I don't want to be in my browser, I want to be in my aggregator. Right now, I have a variety of search feeds from a bunch of different engines. If gada.be provided results in RSS or Atom form, I'd be able to cut a lot of those back. As it stands, having those results live in HTML in a browser makes it far less interesting to me.
smalltalk
October 12, 2005 10:47:22.891
marketing
October 12, 2005 11:29:21.612
itNews
October 12, 2005 12:37:16.906
Here's good news - Yahoo and MS are working together on IM, allowing their networks to talk. At present, the various IM systems are like independent, unconnected phone networks - a set of isolated silos, with the AIM one being the biggest. Maybe this will generate enough momentum that AOL will be forced to respond. Let's hope so.
spam
October 12, 2005 15:50:58.193
Sylvie Noël is seeing the same thing I am - splogs are starting to choke the various blog search engines:
If you've got a PubSub account, you've probably come across these in the returns from whatever search term you've put in. I find them very annoying, as they drown out the few interesting new blogs that PubSub sometimes throws my way. In fact, it's destroying the usefulness of PubSub for me.
It's not just PubSub either - Feedster is being run over by splogs, and those blasted ads that Feedster is returning (as full items) are annoying as heck. I'd much prefer to see an ad tacked onto an item - the bozo ad items are not a lot better than spam. Technorati is getting washed and waxed by splogs too - add that to their frequent inaccessibility, and you have a service that's getting less useful all the time.
The damage just spreads...
history
October 12, 2005 15:57:12.277
Here's a site worth looking at if you are a history buff - WWI photos, in color.
movies
October 12, 2005 19:03:42.212
My friend Mike pointed out Orson Scott Card's review of Serenity - I agree, it's a great movie - the sort of movie that makes you realize how good Star Wars could have been if Whedon had been in charge.
general
October 12, 2005 19:06:39.214
Mark Watson compares this process to MS' update, and calls it simple:
Why can't Microsoft make upgrades this easy. A few caveats: Ubuntu is not officially releasing "Breezy" until tomorrow, so I did this on my laptop (which is not my main Linux development system): In the Synaptic package manager, under Settings -> Repository, I manually edited my repositories changing all occurrences of "hoary" to "breezy" and I removed the install CDROM as a repository source. I then clicked the "Mark All Upgrades" taskbar icon and then clicked "Apply" - when asked, I chose the "Smart Mode" upgrade that apparently is meant for upgrading to new releases. One particularly great thing: under "hoary", I had to build and install my own driver for the RT2500 wifi device in my laptop and manually start it. After the upgrade, wireless is on with no manual operations. Note that with the RT2500, when booting Windows XP, I have to manually start wireless.
I don't know about you, but any process that involves manually editing configuration files and then building a driver isn't "easy", and doesn't compare favorably to Windows Update, or to the Mac updater either.
gadgets
October 12, 2005 19:28:45.052
PVRBlog has the scoop on something I find really interesting - the possible evolution of iTunes into an Apple Media Center:
The new iMac + Front Row package looks pretty similar to the first versions of Microsoft's Media Center XP. You have simple access to your music, photos, videos, and DVD player, all from a small iPod-like remote. It doesn't look like they're concentrating on sending the video to another room or to a larger screen, but if you live in a small apartment or dorm room and don't need to send video out to a larger screen, backing away from your iMac and using the remote could be a pretty good solution for an entertainment PC.
If Apple comes out with PVR capabilities, I'd get one of those instead of another Media Center PC. The Media Center PC has been too flaky.
humor
October 12, 2005 21:09:09.687
When procedural hairspray just doesn't cut it anymore :)

gadgets
October 12, 2005 23:19:38.609
Scoble on the iPod video:
One thing, though. Steve Jobs better never tell me we're copying him next time I meet him in the street. Why? Cause he brought out a video-playing computer (we call those Media Centers) and a portable video-playing device.
I'll bet that there will be one crucial difference - the Apple version probably won't drop audio for no apparent reason, like my Media Center PC does.
PR
October 13, 2005 9:11:15.555
Tom Murphy of PR Opinions has some thoughts on "identity management" - i.e., knowing what people are finding out about you via the search engines. As he points out, this can be particularly interesting if you have a common name:
This online reputation ecosystem was brought home to me recently in a personal way. My parents, God bless them, weren't the most imaginative when deciding on my name. It's a proud family name, but Tom Murphy isn't exactly exotic. Indeed a quick search finds a playwright, the mayor of Pittsburgh and thousands of other similarly named individuals. We all have the same problem. There was an analyst at Meta Group (R.I.P.) called Tom Murphy and for years we used to receive each other's media queries. It's funny we now both work at Microsoft and the confusion has continued unabated.
But in the past week or so, the media in Ireland and the UK have been focussing in on an unsavoury Tom Murphy or to give him his full title, Tom "Slab" Murphy (no relation). He is the alleged chief of staff of the IRA and has been linked with some dodgy property dealings in the UK amongst other things. The story has been on every TV news bulletin, radio bulletin, broadsheet, tabloid and online news service over here. A friend of mine joked that soon I'd be getting a lot more "respect". Although there's little likelihood that we'd be mistaken for each other, and of course he could take major offence at being mistaken for a PR practitioner, it illustrates the vagaries of online reputation.
I don't get associated with anyone that interesting, but have a look at a Google search for me - I'm hitting the top two slots again, with the Column Two guy in the third and fourth - and some consultant in the fifth slot. Mind you, I don't know either of those two guys, but both are in jobs similar enough to mine that there could be confusion from people who don't already know the one they are looking for.
The more interesting name match is the judge, who's there if you scroll down. In various name searches I have set up in BottomFeeder, that guy comes up because decisions he makes from the bench sometimes hit the news.
Which all goes back to what people will think when they Google you. The first problem, of course, is the possibility of getting the "wrong you". Which could make for a bad introduction before you even meet. What's the answer to that? I have no idea, honestly.
PR
October 13, 2005 9:22:14.726
Russell Beattie demonstrates how hard it is for a company to shed a bad reputation - he's very, very wary of Microsoft, even when they are doing generally good things:
Microsoft Bribery- Microsoft is using it’s $60 billion cash reserves to buy out everyone who it has stomped over in the past decade. Sun, Netscape (AOL), Novell and now Real. Do they really think that cleans up their image? The value they get from their illegal monopolistic practices far outweighs the pitances they’re paying out in renumerations. This round of settlements is just clearing up loose ends so they can start another round of aggressive business tactics (look at Sendo for a recent example). They’re also doing things like “embracing” open standards like RSS, opening up their Office doc XML stuff, licensing their mobile OS to Palm, and doing IM interop? Sorry, it all looks good, but is mostly an effort on MS’ part to improve its public image, no less. As soon as they can crush their current competition, they’ll be back to their same old ways.
Now, the general buying public doesn't share the nasty image of MS that a lot of tech folks do, which is why the generally bad reputation they built up didn't do more harm. Still, Russ' post shows how persistent that "first impression" can be. With many people, you may never get a chance to create a better one.
tv
October 13, 2005 11:16:22.945
Derek points out where the new downloadable TV thing that Apple (and ABC) are rolling out could go:
Where's the win, though?
Consumers have wanted "a la carte" cable programming for a while. Instead of being forced to buy bundles of 120 channels to get the 6 they want, they've wanted the ability to buy just those channels and (more importantly) pay for just those channels.
This has the potential for changing this dynamic even further, allowing people to buy their shows a la carte, and to eliminate many middlemen in the process.
This is something much bigger than the PVR - it's disintermediation hitting the entire broadcast and cable business between the eyes. It should be an interesting couple of years coming up.
marketing
October 13, 2005 15:30:06.160
It's all about branding and excitement. My daughter is only vaguely aware of the Media Center PC, but she's excited as heck about the new stuff Apple announced. First thing - MS needs better names. Second thing - the Media Center PC needs to be far easier to use. The wife and I went through too much pain to get ours working. If Apple gets into this game with a consumer friendly device, the Media Center PC will be a goner.
development
October 13, 2005 17:42:16.973
Interesting article on "Higher Order Messages" here - what's HOM?
A higher order message is a message that takes another message as an "argument". It defines how that message is forwarded on to one or more objects and how the responses are collated and returned to the sender. They fit well with collections; a single higher order message can perform a query or update of all the objects in a collection.
Nat Pryce goes on to give an example in Ruby. The frothing reference? Here it is:
The higher order messaging version does have messy dots between messages, but unfortunately that's an aspect of Ruby we can't change. At the risk of sounding like a frothing evangelist, I have to admit that the code would be neater in Smalltalk
And he gives a Smalltalk example. The interesting thing is, Michael did some HOM work in VisualWorks a couple years back - I can't find any posts from him about it, but you can load the HigherOrderMessaging package from the public store.
I just like the idea of being a frothing evangelist :) Maybe I'll bring some Alka-Seltzer to my next speaking engagement so that I can really play that up :)
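For anyone who wants to see the shape of the idea without loading anything, here's a small sketch in Python. It is neither Nat's Ruby nor the VisualWorks package, and every name in it is invented: `where` and `collect` each hand back an object that captures the next message and forwards it to every element of the collection.

```python
class _Select:
    """Captures the next message; keeps the elements that answer true to it."""

    def __init__(self, items):
        self._items = items

    def __getattr__(self, name):
        def forward(*args, **kwargs):
            return [item for item in self._items
                    if getattr(item, name)(*args, **kwargs)]
        return forward


class _Collect:
    """Captures the next message; gathers each element's answer to it."""

    def __init__(self, items):
        self._items = items

    def __getattr__(self, name):
        def forward(*args, **kwargs):
            return [getattr(item, name)(*args, **kwargs)
                    for item in self._items]
        return forward


def where(items):
    return _Select(items)


def collect(items):
    return _Collect(items)


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def is_retired(self):
        return self.age >= 65

    def greeting(self, prefix):
        return f"{prefix} {self.name}"


people = [Person("Ann", 70), Person("Bob", 40)]
retired = where(people).is_retired()        # the message is the "argument"
print(collect(retired).greeting("Hello,"))  # forwarded to each element
```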
gadgets
October 13, 2005 22:37:21.819
We are really, really losing patience with the Media Center PC. When we first bought it, the price and feature list sounded really good. Then reality set in, with the various "interesting" setup problems. We finally got it working, and got things hooked up to the TV. That's when the fun really started.
Every day or two, the blasted thing just starts recording without sound. The bizarre thing is, sound isn't off on the PC when this happens - heck, the stupid media center application still beeps as you walk through it, trying to figure out why the heck the sound didn't get picked up with the show.
We had been considering buying another one of these as a replacement for our dying ReplayTV. Now, I'll just wait for Apple to ship something in this space. I rather suspect that it will actually work.
travel
October 14, 2005 0:54:21.020
I'm heading up to New Jersey for a customer meeting in the morning - a quick ride up to Newark on Amtrak, and then the same thing back again in the late afternoon. I'll likely be network free until I get back.
WebServices
October 14, 2005 12:02:44.624
Ted Neward echoes the conventional wisdom on CORBA - that lack of interop killed it:
For starters, Steve Vinoski was a bit miffed at the idea posited by Mark Baker that CORBA failed. Sorry, Steve, I have to say it, but I agree with Mark--CORBA never fulfilled on its intended promise of seamless middleware interoperability and integration capabilities, and certainly not over the Internet in any meaningful way. By the time CORBA began to address some of those issues--firewalls being a big one--the world had already pretty much abandoned both the "distributed object brokers" (the other being COM/DCOM) and were starting to explore HTTP as the be-all, end-all transport protocol.
Lack of interop was never really the problem - I've seen the VisualWorks CORBA broker working against a large variety of brokers, including a few that no longer exist. Ted touches on the right answer - firewalls. WS* succeeded where CORBA failed for a very, very simple reason - port 80. To get a CORBA hookup between two entities, you have to go have a discussion with the IT (and, if your outfit is big enough, IT security) guys and get them to open up a port in the firewall. Their default answer is going to be no, so this takes work. Easier to just forget the whole thing.
Now, take WS*, by contrast. Well - the SOAP posts come straight into the already open port 80, so you don't need to have that talk with IT. This makes it far, far easier for various skunkworks projects to get going before anyone notices them - by which point they might be too important to kill off.
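To make the port 80 point concrete, here's a rough Python sketch of what a SOAP 1.1 call amounts to: an ordinary HTTP POST with an XML body. The endpoint, the GetQuote operation, and the helper names are all made up for illustration, and nothing is actually sent over the network.

```python
import urllib.request
from urllib.parse import urlsplit

# Hypothetical endpoint and operation, used only for illustration.
ENDPOINT = "http://example.com/services/quote"

ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetQuote xmlns="http://example.com/quote">
      <Symbol>ABC</Symbol>
    </GetQuote>
  </soap:Body>
</soap:Envelope>
"""


def build_soap_request(endpoint, envelope):
    """Build (but do not send) an HTTP POST carrying a SOAP 1.1 envelope."""
    return urllib.request.Request(
        endpoint,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"http://example.com/quote/GetQuote"',
        },
    )


def effective_port(url):
    """Return the TCP port a URL resolves to, falling back to the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return {"http": 80, "https": 443}[parts.scheme]


request = build_soap_request(ENDPOINT, ENVELOPE)
print(request.get_method(), effective_port(request.full_url))  # POST 80
```

From the firewall's side, that request looks much like any other web POST, which is why nobody has to ask IT for a new port.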
The WS* stack is at least as complex as (and, at this point, arguably more complex than) CORBA ever was. It's no more or less interoperable between service brokers either (technology-wise). It's effectively more interoperable solely because it uses port 80.
web
October 14, 2005 12:03:06.536
Here's an article on tagging problems from September - I meant to comment on it then, but I find myself looking at my flagged posts now that I'm on a train. Oddly enough, that relates to the problem at hand. Here's the scenario that the post lays out as problematic:
Let's say Joe reads a new article about a battery technology breakthrough in the Scientific American. Joe has been thinking about buying a fuel-efficient car lately. When Joe goes to tag the article's web page, he uses the following tags: "battery," "fuel-savings," "car," "future-vehicle." Let's say the article comes with a .gif of a high-level schematic for how the battery works. Joe saves the .gif in his Flikkr account, tagging it with "battery," "schematic," and "fuel-savings."
Eighteen months and many tags later, due to Joe's profession as an engineer at Intel, he has an electric moment and realizes the battery tech breakthrough has more relevance to something he's directly working on, in nano-tech. Given the keywords he chose, will he be able to 1) recall how he tagged the original article, to find it later on or, 2) if he can find it at all, will he be able to easily re-tag the article and the schematic .gif to match the new context in which Joe finds these ideas relevant? I wouldn't bet on either outcome.
That is a problem, and it's one most of us run into a lot. I use del.icio.us to tag posts that I want to be able to find later - I use the tag "cst" for posts that I want to share with people about Cincom Smalltalk. Now, the problem I'm going to run into here isn't the same as the one above - I'm not going to forget the tag. However, over time, I'll tag a whole ton of things that way. Once I have tens (never mind hundreds or thousands) of posts tagged that way, how do I find the needle I actually want in that haystack?
The article suggests that refactoring tools (like Smalltalk's refactoring browser) for tag libraries are the answer. I don't think so. There's a wall of inertia that's going to prevent most people from doing that. Heck, the simpler problem that my title references is that most people won't tag their posts at all. Of those who do, an even smaller subset will be motivated to refactor.
Don't believe me? Well, let's look at two A-Listers as an example - Scoble and Winer. The former never categorizes a post, and the latter rarely bothers with a title or a category. These two are widely read, and deeply involved in "web 2.0" discussions - and even they can't be bothered to take the minimal amount of action necessary to enable it. How likely do you think it is that the average web user will bother? For your answer, walk into anyone's old video cassette library and see how many of the ones recorded at home actually have a label. The answer will be enlightening.
Here's more evidence: I subscribe to 315 feeds as I write this, and I keep a fairly large cache of old items for each one. Let's trawl through those and see which ones have a category set:
RSSFeedManager default getAllItems size.
That tells me how many items I have sitting in memory. The response? 16,466. Now, let's see how many have no category set:
(RSSFeedManager default getAllItems select: [:each | each category isNil or: [each category = 'None']]) size.
The result there? 10,810. Nearly two thirds of the items I'm tracking have no category associated with them. Now, let's walk back to the web 2.0 discussions where the semantic web heads are trying to decide whether RDF, or OPML, or something else is the best way to make sense of all this. I'll make it simple for them - it just doesn't matter. The problem isn't the one posited in the article - i.e., "how did I categorize that item?" It's "holy smokes, I'm awash in a sea of completely uncategorized plain text!" Before someone chimes in that text search will auto-categorize, I'll point out that engines like Google already do a lot of that - and, as Scoble has been noticing, there are limits to that.
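For anyone who wants to run the same kind of check outside of a Smalltalk image, here's a minimal Python sketch of the count. It assumes each item is a plain dictionary with an optional "category" key (a stand-in, not BottomFeeder's actual item class), and it treats a missing value or the string 'None' as uncategorized, the same way the query above does.

```python
def uncategorized_share(items):
    """Return the fraction of items whose category is missing or the placeholder 'None'."""
    if not items:
        return 0.0
    missing = sum(1 for item in items if item.get("category") in (None, "None"))
    return missing / len(items)


# Tiny made-up sample standing in for a real feed cache.
sample = [
    {"title": "a", "category": "web"},
    {"title": "b", "category": None},
    {"title": "c", "category": "None"},
    {"title": "d"},  # no category key at all
]
print(uncategorized_share(sample))  # 0.75

# The same proportion using the two counts reported in the post.
reported_share = 10810 / 16466
print(f"{reported_share:.1%}")
```

The second figure is where "nearly two thirds" comes from.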
spam
October 14, 2005 19:28:16.538
Tim Bray wants to know what the point of splogs is:
I suspect most people never see spamblogs, but let me tell you, there are a lot of them out there and they get weirder and weirder and weirder. I’m actually baffled as to why they exist.
Oh, this is a simple one. Set up a search feed in your favorite aggregator, and then watch what comes back - especially from PubSub and Feedster, which are just filled with splogs right now.
logs
October 14, 2005 22:22:37.927
End of the week, and time to have a look at the logs again - BottomFeeder downloads are back up - looks like last week was the blip. This week: 813 per day:
| Platform | BottomFeeder Downloads |
| Mac 8/9 | 1567 |
| Mac X | 1359 |
| Windows | 799 |
| HPUX | 777 |
| Update | 430 |
| Sources | 339 |
| Linux x86 | 190 |
| CE ARM | 109 |
| Solaris | 28 |
| Windows98/ME | 26 |
| Linux Sparc | 25 |
| AIX | 23 |
| Linux PPC | 8 |
| SGI | 7 |
| ADUX | 4 |
| CE x86 | 3 |
| Source Script | 3 |
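The per-day figure falls out of the table with a little arithmetic. Here's a quick Python sketch, assuming the log covers a seven-day week (the window isn't stated explicitly) and that the average is rounded down:

```python
# Download counts copied from the table above.
downloads = {
    "Mac 8/9": 1567,
    "Mac X": 1359,
    "Windows": 799,
    "HPUX": 777,
    "Update": 430,
    "Sources": 339,
    "Linux x86": 190,
    "CE ARM": 109,
    "Solaris": 28,
    "Windows98/ME": 26,
    "Linux Sparc": 25,
    "AIX": 23,
    "Linux PPC": 8,
    "SGI": 7,
    "ADUX": 4,
    "CE x86": 3,
    "Source Script": 3,
}

DAYS_IN_LOG = 7  # assumption: one week of logs

total = sum(downloads.values())
per_day = total // DAYS_IN_LOG  # whole downloads per day, rounded down
mac_share = (downloads["Mac 8/9"] + downloads["Mac X"]) / total

print(total, per_day, f"{mac_share:.1%}")  # 5697 813 51.4%
```

On those assumptions, the two Mac rows together account for just over half of the week's downloads.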
Wow, the Mac numbers went way up - I wonder what that's about? Interesting, and the word I hear about the VW VM getting better on the Mac in 7.4 (December of this year) is welcome news, given those numbers. Next, a look at the HTML page accesses:
| Tool | Percentage of Accesses |
| Internet Explorer | 52.5% |
| Mozilla | 40% |
| Other | 3.4% |
| MSN Bot | 2.3% |
| Google Bot | 1.8% |
That's a fascinating jump up in IE hits - traffic has been up, and there was a huge spam wave last week - and most of the spam reports itself as IE. So, I don't think I can read anything into those numbers. Let's have a look at the RSS access:
| Tool | Percentage of Accesses |
| Mozilla | 23.1% |
| BottomFeeder | 15.4% |
| Net News Wire | 11% |
| Other | 8.9% |
| Feed Demon | 4.4% |
| Internet Explorer | 4.3% |
| Safari RSS | 4.3% |
| Planet Smalltalk | 3.7% |
| NewsGator | 3% |
| Magpie | 2.9% |
| RSSReader | 2.6% |
| SharpReader | 2.5% |
| BlogLines | 2.1% |
| Feed Reader | 1.6% |
| BlogSearch | 1.5% |
| Liferea | 1.4% |
| RSS Bandit | 1.2% |
| Jakarta | 1.1% |
| Google Bot | 1% |
| JetBrains | 1% |
| Feed Tagger | 1% |
| RSS 2 Email | 1% |
| News Fire | 1% |
The RSS feed accesses don't show the same IE spike, which tells me that it's almost certainly the spam surge. The variety of tools being used in this space is still huge though - the consolidation that started has not run its course yet.
management
October 15, 2005 10:49:29.126
Scoble thinks an acquisition can change MS' image (which basically means changing its culture). He has it backwards:
Oh, and all it would take to completely remake Microsoft’s image? One acquisition. I hear we have $60 billion in the bank. I don’t want all of it. Just a small percentage. In fact, it’ll cost far less than it cost us to settle with Real to get in this game.
MS is huge, which means that nearly any acquisition will be of a smaller entity. Smaller entities simply don't have the power to change a larger corporate culture. One of three things happens:
- The staff of the acquired entity leaves, since things are "too different" for them
- The staff of the acquired entity "gives up", becoming more like the acquirer
- The smaller entity goes quiet, hoping to stay "under the radar" of corporate - thus maintaining a semblance of their old culture
Unless the two entities are roughly the same size - in that case, you get what amounts to civil war for an unpredictable period of time, as each side tries to "win". I've seen that one personally, and watched it in customers. It's not pretty, and it doesn't help anyone.
Odds are, MS wouldn't have that, because a merger of near equals is hard to imagine for them. Any smaller entity they get will just be swallowed whole, with the corporate culture enforced over it. Chance of image/culture change for MS in all this? About nil.
rss
October 15, 2005 11:05:48.302
Looks like AOL is trying to jump into the blog search game - Steve Rubel has the story:
AOL and Intelliseek on Monday plan to unveil a blog content deal. Sue MacDonald at Intelliseek confirmed that the deal - set to be announced Monday at 7:30 a.m. - will give AOL access to rich blog data that they will deliver to consumers. While MacDonald did not say what specific data AOL will get, one can certainly speculate that it will come from BlogPulse and reside on the new AOL.com site.
I've been fairly happy with the BlogPulse results - they don't seem to contain the volume of splog content that is making Feedster and PubSub less useful every day.
management
October 15, 2005 11:34:06.169
I continue to get requests along the lines of "is Smalltalk safe for the next 20 years?". I've got this post out there, which I think sums things up nicely. However, an article written by John Dvorak illustrated to me again just how hard it is to peer into the near future (much less any further out). I can't find the article in Dvorak's PC Mag archive yet - it'll probably appear there next week. When you look for it, the title is "Computers and Modern Anarchy". Dvorak got into a whole thread about control and anarchy that doesn't interest me a lot - but he did make a point along the way that I wanted to highlight:
If you were running a nexus point or a BBS, you had to have huge banks of modems and multiple phone lines to receive a user on your "site". Most users today can probably no longer configure or use a modem. Dial-up is automatic, and it dials the internet, not each individual target.
Imagine how you surf the web today and realize that before it existed, you had to get the phone number of the site and call it directly each time. There was no hyperlinking; if you wanted to jump from site A to site B, you would have to hang up on one site and dial another. This was standard practice a mere 13 years or so ago.
Think about that - I remember how excited I was about getting USENET access via one of the BBS systems back then - and I remember the large amounts of money a roommate spent on chat too (something that is completely free with AIM, MS Messenger, etc. today). In the early 90's, it was a completely different (online) world.
Now take that forward - what are things going to look like in 15 years? From 1990, I sure wouldn't have seen what's here now. I seriously doubt that anyone sees 2020 clearly either.
games
October 15, 2005 19:01:31.491
Civ III is much more difficult than the old "Civ" (the DOS game) was. I used to play that at the King or Emperor level; I finally managed to win a game at "Warlord" level this afternoon, and I only managed that by staying on Monarchy, relentlessly building military units, and whacking the other powers until they died. How the heck would you win a game via the space race route? I have no clue...
tv
October 16, 2005 1:12:38.975
We missed "Lost" on Wednesday - our misbehaving ReplayTV was on the fritz. So, we caught up this evening. In the middle of the episode, Hurley starts talking to Rose about his problem (the food found in the hatch). Jack walks by, and says hi to her.
At this point, the wife and I are saying - whoaaaa - she's dead! She died last season, I thought. I went hunting around the episode guides, and found this - "White Rabbit" last year, episode 5:
Jack is nearly delirious from lack of sleep and struggles to overcome the haunting events that brought him to Australia and, subsequently, to the island. Meanwhile, Boone gets caught in a treacherous riptide trying to save a woman who went out swimming. A pregnant Claire's health takes a bad turn from lack of fluids, and a thief may have stolen the last bottles of water. Veronica Hamel guest-stars as Jack's mother, Margo. Also, Jack (Matthew Fox) flashes back at 12 years old, to find himself on the playground in an altercation with a bully, who ultimately beats him up, and later learns a life lesson from his father.
I'm certain that the woman who Boone couldn't save was Rose - and now here she is, back in the flesh, and no one remembers that she's dead? What's up with that? Did I miss something, or is this a clue that will become plain later?
spam
October 16, 2005 11:11:23.844
Chris Pirillo says what the rest of us have been thinking for a while now:
In the past few days, I've been inundated with an enormous amount of subscribed search spam for designated keywords. To the tune of hundreds, if not THOUSANDS, of bunk entries. Who knew "lockergnome" and "pirillo" would be THAT popular?! Still, I can't help but think that others are having the same headaches - and 99% of the crap coming in is directly from a single domain: blogspot.com. Google, it may have been a smart acquisition in the beginning, but y'all need to clean house in a big way. You're the tallest nail, and you're really getting pounded - and now others, who aren't even using your service, are getting pounded. Blogspot has become nothing but a crapfarm, and your brand is going to go down with it. If your motto truly is to do no evil, then you need to start putting some resources behind an effort to curb this train wreck.
I don't know what's (specifically) making it so insanely easy for these spammers to get signed into your system, but you need to change that - ASAP. Forget about developing another Web-based aggregator for now (sorry, Shellen - Blogspot needs more help at this point). I'd love to ban / filter anything and everything that comes from blogspot.com, but the problem is that I have quite a few friends on that service who are sitting in the 1% "legitimate" minority.
As to why spammers go for that system, that's the simple part. It's free, and the signup process can be easily scripted. Which means that you can bot the whole thing, and create a universe of splogs within a few minutes. It's been a disaster waiting to happen, and now it's happening in a huge way.
Update: The sheer volume of these things is amazing. Every one of the blog search systems I use heavily for feeds - Technorati, BlogPulse, IceRocket, PubSub, Google Blog Search - they all get gamed by these splogs. Blog search just got a lot harder.
spam
October 16, 2005 11:57:39.473
It occurs to me that the next target for link spammers will be del.icio.us. They have a simple RESTful interface, and it's easy enough to set up an account. All I have to do is wait, I suppose.
spam
October 16, 2005 14:32:57.410
Tim Bray noticed that spam blogs have just exploded - here's an example - look at the latest results for this Feedster search feed for Smalltalk. There have been 76 new items (some dupes) since midnight. Of those, 4 are actually what I'm looking for (references to Smalltalk, the programming language). Two are false positives, references to "smalltalk", as in speaking. The other 70? All splog results.
It's clear that these are bots at work - the Blogspot templates are the same for all the results. The funny thing is, the products and services being flogged aren't your typical pharmaceuticals and porn - the "offers" are all over the map. This seems to be a well organized and coordinated attack, using a boatload of fake blogs as the delivery vehicle.
Looks to me like it's time for Google to step up.