October 2007


After much anticipation, an amazing new patent retrieval tool launched yesterday. SparkIP is a patent search tool of which my colleague (he is my boss-man, really) Tim Lenoir is a founder. SparkIP combines robust on-the-fly clustering of search results, similar to Vivisimo’s Clusty, with a pretty incredible twist: the user navigates the search results visually. Results are clustered, and the user is first presented not with patent results per se but with patent cluster results. The company refers to each cluster as a “SparkCluster Map.” Each of these cluster “maps” has numerous clusters within it. This set of cluster maps (shown here)

SparkIPLandscape

referred to as a landscape, is an excellent way of cutting an often overwhelming set of relevant documents down to size while providing rich visual information about each cluster. This is truly a forward-looking tool in many respects, but particularly in terms of generating intelligent and useful information about the technologies, people, and institutions related to a keyword search. SparkIP has raised the bar on information retrieval right here. But your search is not done yet.

Given the landscape, you can then select any of the specific cluster maps (seven in all were returned on “text mining”) by clicking directly on the map graphic. I selected the second cluster map, “information retrieval.” This brings up an enlarged view of the cluster map, revealing the clusters within the map, shown here:

SparkClusterMap

Clicking on one of the map nodes/clusters (I selected the “document information retrieval” node at the very center of the cluster map) then brings up a view called “Technology Detail” (shown below):

TechDetail

More information-overload-reducing brilliance is on display here in SparkIP. First, note that while 61 patents were retrieved, only 10 were returned. Further, there are likely hundreds more patents relevant to “text mining.” What appears to be happening here is that SparkIP has developed patent-filtering heuristics “under the hood” that get rid of the high volume of junk patents cluttering any patent database. After all, many if not most patents are created by their originators for purposes other than to stake a claim on a highly specific technology. Many a business game is played with patents as the pieces. An organization might want to try to occupy an intellectual property space to see if it can land licensing suckers. Other patents are premature. Some others overreach or are incredibly vague and therefore unenforceable. And so on.

There are a number of small problems with the interface, as with many a beta product. The back button removes you entirely from your search results rather than helping you navigate backwards from, say, technology detail view to cluster map view. The meaning of visual iconography such as cluster map node size or color, while intuitive, is not altogether clear just from naively using the tool.

But wait, folks, that’s not all. In addition to keyword-to-landscape patent search, SparkIP will also open up an eBay-esque marketplace for intellectual property. I don’t know if that part is already live or not. I hope to have more time to play around with the site in the coming days.

SparkIP was founded at Duke University through collaboration between Dr. Lenoir, current Pratt School of Engineering Dean Rob Clark, and Johns Hopkins Provost and Senior Vice President for Academic Affairs Kristina Johnson. Since joining Lenoir at Duke I’ve had a couple of small windows of opportunity to provide some technical advice on cluster metrics to SparkIP engineer (and allpatents.org founder) Kevin Webb. But I never even got to see a demo of this thing. And let me tell you, man, this thing is amazing. I put this tool right up there with Clusty and the TRIP evidence-based medicine site as a retrieval tool among the best since the arrival of Google beta.

Congratulations to you Tim, and to you Kevin, and to the rest of the SparkIP team.

Last evening during the weekly Duke FOCUS cluster meeting we enjoyed a talk from Duke OIT AVP and Croquet principal architect Julian Lombardi. Julian is also aligned with ISIS at Duke, which is where I enjoy the opportunity to teach on occasion. I can’t say enough about how neat a guy Julian is, or how absolutely fascinating his presentation on Croquet was. Suffice it to say he had my head nodding in agreement, and his ideas were controversial enough to get the freshmen in the room to make smart-alecky remarks. If that’s not a positive sign of innovation I don’t know what is.

I promised that this would be a note, and it will be a note. I promise.

Julian asked those attending his presentation last evening why we use computers that are overpowered and undercollaborated (to coin a word), and why we use machines with seemingly prehistoric interface tools like a mouse and keyboard. Further, he asked why we don’t have technologies that fit better with the way we work. I’m not sure how he answered this question except to say that we need to engineer software that supports “deep collaboration,” as Julian called it. I think Julian was suggesting that we are sort of stuck in our ways and that we just aren’t picking up available technologies, sticking instead to old guns.

I don’t think the problem is that simple. In fact I suspect there are two significant problems, one intellectual, the other sociological.

The intellectual problem is that I suspect few if any of us actually understand what “deep collaboration” really is. It appears to me that we are only starting to understand collaboration as a phenomenon, and then only as a phenomenon of a digital variety, and then only through data about how people use collaborative technologies. That type of understanding seems to be a sort of cart-before-the-horse phenomenon.

I don’t think (but I certainly do not know for certain) we have a very good understanding of phenomena such as tacit knowledge, communities of practice, activity theory, and so forth. Do we possess a very good grounding insofar as understanding how people work together? How they have worked together? How people might work together?

Funny that Julian called web pages “brochures.” He’s right, they are brochures. I love that perspective he shared, as it made me laugh and then blush. It also appears to me that we’re a pamphlet-publishing culture in general, so web-publishing activities seem to be an adequate reflection of the way we work. After all, where in our culture are we not engaged in this sort of pamphlet-publishing work mode? You’re going to have to go far outside information technology in order to respond (e.g., construction). It appears to me that knowledge workers of all sorts work in a rather linear fashion, and this is perhaps not surprising since our concept of ourselves as subjects arises from engaging in linear tasks such as writing, reading, and watching, all coded in terms of a first-person perspective.

Collaboration, at least in US technoculture, seems to find its apex of sophistication in the assembly of multiple independently-produced pamphlets. We can see this even in open-source software development projects where repositories are open to collaboration. With tools like CVS we each write to our own working copy of a file and then resolve conflicts with any other code-pamphlets that have been written concurrently.
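To make the pamphlet-assembly point a little more concrete, here is a minimal sketch in Python of what "resolving conflicts between concurrently written code-pamphlets" amounts to. This is not how CVS is actually implemented; it is a toy line-by-line three-way merge, and the function name `merge_lines` and the sample data are hypothetical, made up for illustration. It also assumes, unrealistically, that neither edit adds or removes lines.

```python
def merge_lines(base, mine, theirs):
    """Toy three-way merge of two edits made against the same base text.

    Assumption (for illustration only): both edits keep the same number
    of lines as the base, so lines can be compared position by position.
    Returns the merged lines and the indexes of conflicting lines.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")

    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:
            merged.append(m)      # both agree (changed the same way, or not at all)
        elif m == b:
            merged.append(t)      # only "theirs" changed this line
        elif t == b:
            merged.append(m)      # only "mine" changed this line
        else:
            merged.append(b)      # both changed it differently: keep base, flag it
            conflicts.append(i)
    return merged, conflicts


base = ["title", "intro", "body"]
mine = ["title", "my intro", "body"]

# Two authors touching different lines: the pamphlets assemble cleanly.
print(merge_lines(base, mine, ["title", "intro", "their body"]))

# Two authors touching the same line: a human has to step in.
print(merge_lines(base, mine, ["title", "their intro", "body"]))
```

In the first call the two edits combine with no conflicts; in the second, line index 1 is reported as a conflict. Either way the authors never actually work together on a line. They write separately and the tool stitches the results, which is the sense in which this is still pamphlet assembly.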

So the intellectual problem appears to have at least three components, each of which should be explored independently of computational technology: knowing how we know how to work together; knowing how humans have worked together in the past; and divining how we might build on past collaboration modes and our understanding of tacit knowledge to innovate new collaborative paradigms. I think this is an area ripe for intellectual innovation, and I don’t think such an effort should be limited to software engineering.

If this sort of intellectual problem has already been conquered then I admit I just completely missed it.  But I currently see that there is a huge gap between our understanding of the cognitive dimensions of collaboration and the understanding of how people use, say, Facebook, to collaborate with one another.  What is the biology, the phenomenology, and the behavior of human interaction?

The sociological problem is simply that such innovative interfaces have lacked, for a huge number of reasons, crossover to early adopters. Who are early adopters? The cool, hip techies to whom the masses look for what’s hot, what’s cool, those who serve as bellwethers for their intellectual and geographic locales. Those of us who are into inventing are not very good at engineering social transitions and we don’t make early adopters at all. And when we lack early adopters we lack, well, adoption itself, don’t we?

Was this just a note?