Dead or Alive – the future of ‘hashtags’

A couple of weeks ago, Scoble had an epiphany that ‘hashtags are dead…’. That epiphany was really more about realtime search than the future of hashtags. If anything, inline tags (hashtags) are going to be an increasingly important aspect of the realtime web.

Inline tags are simply the combination of a unique character and a keyword (e.g. #tag). Introduced to Twitter by Chris Messina back in 2007, hashtags have seen increasing usage around current events, conversation and collaboration. Here’s what makes them uniquely valuable, particularly in public micro-messages:

  • Natural, fast, easy
    Requiring only one additional keystroke, they become part of natural practice, need no further action after typing, and do not depend on the medium.
  • Explicit
    A tag effectively reorients the rest of the message as context around that keyword, and enables explicit, permissionless participation in a public thread.
  • Threading
    Tags link messages together in topical threads, which allow for asynchronous public conversation. These threads are automatically related to contributors, other threads, and links, with the message text forming a tight contextual wrapper around the thread itself.
  • Universal
    Being text-based, they can be used anywhere you type.
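
As a rough illustration of the threading idea, here is a minimal Python sketch that pulls inline tags out of messages and groups the messages into topical threads. The regular expression, the function names and the sample messages are all assumptions made for this example; real services apply more careful rules (this pattern would, for instance, also pick up URL fragments).

```python
import re
from collections import defaultdict

# Assumed tag syntax: '#' followed by one or more word characters.
TAG_PATTERN = re.compile(r"#(\w+)")


def extract_tags(message):
    """Return the lowercased inline tags found in a message, in order."""
    return [tag.lower() for tag in TAG_PATTERN.findall(message)]


def build_threads(messages):
    """Group (author, text) pairs into threads keyed by tag."""
    threads = defaultdict(list)
    for author, text in messages:
        # set() so a tag repeated within one message is counted only once
        for tag in set(extract_tags(text)):
            threads[tag].append((author, text))
    return dict(threads)


# Made-up sample messages, purely for illustration.
messages = [
    ("alice", "Heading to Austin for #sxsw"),
    ("bob", "Best panel so far at #SXSW was on #realtime search"),
    ("carol", "Is #realtime search replacing tags?"),
]

threads = build_threads(messages)
for tag in sorted(threads):
    print(tag, [author for author, _ in threads[tag]])
```

Each thread ends up tied to its contributors and, through messages that carry more than one tag, to other threads. Nobody had to ask permission to join.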

Hashtags are just the beginning of inline tagging in public micro-messages. They will enable explicit threading and permissionless participation in the realtime web in a natural and extensible way. Chris Messina’s original post had some great details, some of which I believe will become part of the core infrastructure of the realtime web. And as public micro-messaging services proliferate, inline tags will help enable cross-platform threading, with the potential to weave together the web and even our offline data.


15 thoughts on “Dead or Alive – the future of ‘hashtags’”

  1. I wouldn't really call Scoble's proclamation an “epiphany”; I don't think we've ever used #hashtags terribly well.

    The openness of hashtags is their greatest strength and their greatest weakness. I believe #hashtags will only be popular and well understood when it makes economic sense for a person to use them; i.e. only when using a hashtag helps someone (promote themselves, be understood better, quicker or more easily, take part in a conversation) will people use them regularly and with any rigor.

    Meaning:
    1) Given that the ability to understand the context of messages (micro- and macro-) through natural language search and contextual analysis (or through semantic web-type architecture) is difficult…
    2) … we depend on users to use hashtags to self-identify important parts of messages….
    3) … but there isn't any real meaningful need for people to use hashtags until people can privately capture the externalities behind free public metadata.

    Until threads are meaningful (e.g. public, searchable, indexed, promotable), #hashtags are useless.

  2. I definitely think how we use inline tags will change as their usage
    generates value for us. I don't think it has to be 'economic' value in
    financial terms, but personally relevant value – e.g. being a basis for
    reorienting what the web delivers to me.

  3. Unfortunately hashtags will run into the same problem that is bedeviling image libraries: the need for a controlled vocabulary. Explicit tagging with a # at least avoids the problem of search returning results where a tagword is used inadvertently or irrelevantly. #sxsw works because it is an agreed-upon standard. But what about #travel, #travels, #traveling, #travelling, #color and #colour, etc.? Image libraries that need to (manually) keyword images to enable search results have long dealt with this in two ways – standardized keywords (a controlled vocabulary, usually hierarchical) and stemming (a search for #frog returns #frogs).

    Obviously metadata is essential to define the meaning of content for search, but the variability of our language (let alone the ambiguity of expression and homographs) will make this always an imperfect art.

  4. Perfect example; the controlled vocabulary of many image libraries hinders comparison and sales due to the variability of language and of the minds of buyers and sellers; perhaps a mix of photographer-supplied keywords and searcher-supplied image comparison (via a TinEye-style reverse image search) would help people search for images?

  5. I think efforts to make topical in-line tags (e.g. #tag) formal and controlled will both diminish their value and ultimately fail. I think there is value simply in using the # to indicate specific importance in a message. In real-time messaging I don't think we're looking for 100% accuracy and association. If a tweet gets dropped from a conversation thread it's not a big deal. Frequency is what builds relevance in the medium, and with enough frequency tags eventually become related. For search we're looking more for patterns and evolution than we are for precision. And if users have to think too much about what they tweet, usage will go down, which I think will decrease overall value and potential.

  6. I'm not sure it's a perfect example. The value in the image is the image. I want to find an image to use.

    When I do a Twitter search on a tag, I'm not looking for a tweet to use; I'm looking for the conversation and the context of who's contributed and what else is related.

  7. As Michael suggested there are really different varieties of search:

    With image search, one is usually not looking for a specific image known in advance, but rather shopping for an as-yet-unknown image that will meet one's needs. Stock agencies optimize their sites for that. Their goal is that the buyer finds a satisfactory image and that they make the sale, not that the buyer finds THE one image. Consequently the site's aim is to present a selection of the most topical, salable, desirable images. Noise is a huge problem; unwanted images can send the buyer elsewhere.

    Images in a managed image library like Getty come from multiple sources, sometimes already keyworded, so standardized disambiguating tokens are essential. A search for “orange” color is different from “orange” fruit or “Orange” location. With controlled vocabulary the search engine will also apply synonyms so a search for glad will return images keyworded happy if the two are linked in the lexicon. The point is that the buyer does not have to know the standard terminology, but the keyworders do.

    Alamy has taken a different approach, with an unedited collection of 16+ million images and a small support staff. There are no standard terms, so a photobuyer has to look for multiple synonyms and use qualifiers like <orange fruit> to disambiguate. With no editorial review there's lots of room for error and lots of noise in the results. The worst case is that the buyer goes away empty-handed when satisfactory images in the file are not returned (or are on page 500).

    Google has the worst of it, since Google Images has no keywords for images, just filenames and adjacent text on the web page. They don't even read IPTC and embedded metadata. (Now that would change things completely!!!)

    Without human intervention, search engines are at a disadvantage since they are blind – they have no idea what the image subject is at all. An image named dog may be of a horse.

  8. Interesting – this is helping me get at the difference between meta-data and
    in-line tags.
    When using an in-line tag we are associating the new content with existing
    content.

    When tagging an object we are only modifying existing content.

    Using an in-line tag means you are actually contributing to the content
    itself (in the case of a hashtag), or contributing to a contextual wrapper
    (in the case of a link). Account references are a bit of both with a bit
    more conversation.

    The other piece is that a person's Twitter stream builds a contextual
    wrapper around themselves that has linked them to people, topics, things
    that are interesting to them – not just the things but their contextual
    wrappers.

    So rather than being about proactive search, I see the opportunity in this
    being augmented discovery and relevance filtering. It's more about the
    interweave of relationships than about the objects themselves.

  9. I think metadata is any information ABOUT some other entity, object, person or piece of information. As you note, it can be something which is actually part of the entity, or something added on to it. And it can be created at the time the entity is created, or added afterward by the creator or someone else.

    A hashtag can be just part of the content which is marked as having a secondary use as a label for the subject of the entity, or it can be tacked on.

    For example:

    Groucho perfected the art of #humor

    What is the difference between a duck? #humor

    A link isn't really metadata itself, but a pointer to another entity or bunch of info. A link often has two parts, the visible label like CNN which is usually part of the content, and the actual http or IP address pointer. We usually lump them together in our mind. Thus:

    I read it on CNN

    I read it on http://www.cnn.com

    I read it on CNN

    The hashtag and the link are similar in one way: they both point to something else. The hashtag #humor refers to a general related term (which could be used in search, for example) and the link http://www.cnn.com points directly to an external entity which has additional info.

    There's a key difference though.

    #humor adds meaning to the entire Tweet

    http://www.cnn.com adds meaning only to the link label CNN

    A wrapper specifically refers to additional info surrounding an entity that adds context. For example, a simple email message is “meet me at 10 at Starbucks”. The email header is a wrapper adding the from, to and when information, delivery details, etc.

    The way you are using it, however, “contextual wrapper” is really just “context”.

    The tweet content itself is just:

    “Groucho perfected the art of #humor”

    The tweet context however would be:

    “Groucho perfected the art of #humor” along with links, search results and more info about:

    the sender (i.e. davidsanger)
    the explicit hashtag ( #humor )
    inferred subjects (i.e. Groucho Marx)

    You could even extend it to info about:

    trending topics at the time the Tweet was sent
    retweets, online references, links to the tweet, comments and trackbacks
    other similar phrases or expressions

    The whole package or bundle would then become a context.

    We don't have a very good way to express or generate such a context. We just kind of wing it as we need to.

  10. David – excellent… thanks for that explanation. Best I've seen.
    What's interesting to me in all of this is the web of context… the
    relationship between the context of different entities. Do you see unique
    properties in the way that it is created now on Twitter? It's somehow in
    that that I believe there is a unique opportunity for determining relevance
    between the entities. And I think it has to do with how people tweet –
    frequently and spontaneously, about what is of interest in the moment –
    which contributes to what is unique about it.

  11. Michael – I have started playing around with Calais as a metadata analyzer and it is promising. The document viewer is the easiest intro (using cut and paste), but they also have a Firefox plugin, Gnosis. So far it doesn't interlink with Twitter IDs, but it does identify people, places, links etc. that are mentioned. You can analyze a Twitter hashtag search like #humor.

    Of course the same technology with access to the Twitter API to analyze relationships is what you are talking about.

    See also the Brian Solis writeup on The Ties that Bind Us – Visualizing Relationships on Twitter and Social Networks
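
Two ideas from the thread above lend themselves to a small sketch: folding spelling variants such as #travelling and #colour into one form, and letting frequency of co-occurrence suggest which tags are related. The Python below is only an illustration under stated assumptions. The variant map is hand-made and hypothetical (a real controlled vocabulary or stemmer would be far larger), and the sample tag lists are invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, hand-made variant map standing in for a controlled
# vocabulary or a stemmer; not taken from any real library.
VARIANTS = {
    "travels": "travel",
    "traveling": "travel",
    "travelling": "travel",
    "colour": "color",
    "frogs": "frog",
}


def normalize(tag):
    """Strip the leading '#', lowercase, and fold known variants together."""
    tag = tag.lstrip("#").lower()
    return VARIANTS.get(tag, tag)


def cooccurrence(tagged_messages):
    """Count how often each pair of normalized tags shares a message."""
    counts = Counter()
    for tags in tagged_messages:
        unique = sorted({normalize(t) for t in tags})
        counts.update(combinations(unique, 2))
    return counts


# Invented sample data: the tags found in four separate messages.
tagged_messages = [
    ["#travel", "#photo"],
    ["#travelling", "#photo"],
    ["#Travels", "#colour"],
    ["#color", "#photo"],
]

pairs = cooccurrence(tagged_messages)
print(pairs.most_common(3))
```

Pairs that keep turning up together rise to the top without anyone agreeing on a standard first, which is roughly the "frequency builds relevance" argument. The variant map is the controlled-vocabulary side of the debate in miniature.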
