28 December 2004

Search, Aggregators, 28-Dec-04 UTC 14:21:01

Summary

Low confidence in ability to search for known, specific XML feed-content:

  • Known content fails to appear in searches;
  • Certain aggregators are more suited to searching archives, yet not all XML-feed content is adequately archived or searchable.

    It remains to be seen what percentage of XML-feed-content exists but cannot be searched, located, or referenced for purposes of [a] monitoring web content or end-user feedback; and [b] implementing corporate responses or development efforts to address these particular comments.

    It appears corporations must rely on the "volume of comments" to ensure that a given beta-error or failure mode is reliably communicated back to corporate.

    This does little to inspire confidence that XML feeds are a reliable early-warning system. Rather, XML feeds are merely another warning system. As we have seen in Asia, a warning system, to be reliable, must provide information at the first notice, not when the problem is self-evident.

    Discussion

    This test checked whether known content actually appears in searches on four aggregators: Technorati, Feedster, Bloglines, and NetNewsIsFree.

    Recall that PubSub searches only future content, after a specific search string has been entered; we therefore exclude PubSub from this particular test.


    Test 1: XML Feed Title Search

    This test took a known, existing XML-feed title and searched for it with the four aggregators.

    Question: Does a known XML-feed title appear in the aggregator search tool? Generally not.

    Test Search Target

    We selected an item from the known Feedster content, then copied text from that selection to use as the search string.

    Results

  • Technorati: Available.

  • Feedster: No result.
    "We're sorry but a serious error has occurred. This problem has been emailed directly to engineering and should be addressed shortly."
  • Bloglines: Nothing.
    Your search - "Spinning corporate monitoring of blogs" - did not match any results.

    We tried all Bloglines search options except the "search the web" option. The title was not found in "my subscriptions" [it should be], nor was it visible in the "Search all blogs" option.
  • NetNewsIsFree: Nothing.

    No results for quoted content.
  • Google: Available.


    Quirk with Feedster

    Feedster's monthly archive includes titles, but its daily selections do not.

    Could the aggregators be stripping out the titles and only allowing "searches for content"?


    Test 2: XML Feed Content Search

    This test focuses on content, not titles.

    Question: Does the known content in known XML feeds appear in the four aggregators or Google? Generally not.

    Search target

    From Feedster's archives, we chose the phrase "Corporations are just spinning the benefits of blogs because" in this blog.

    Results

  • Technorati: Success.

  • NetNewsIsFree: Nothing.
    Produced no matching result, but plenty of "other stuff" unrelated to the primary objective.
  • Feedster: Success, despite a glitch.
    Produced a result, but then transitioned to a "The page you are looking for is currently unavailable" window; solved by simply backspacing.
  • Bloglines: Nothing.

    Produced no result, even when "searching the web". This is strange, as Google similarly produces no result.
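
    As a rough sketch, the outcomes recorded in the two tests can be tallied per tool. The names here (`RESULTS`, `hit_rate`, `rank_tools`) are illustrative assumptions, not part of the original test, and two searches per tool is far too small a sample to be more than indicative.

```python
# Outcomes transcribed from Test 1 (title) and Test 2 (content) in this post.
# True means the known item was found by that tool's search.
RESULTS = {
    "Technorati":    {"title": True,  "content": True},
    "Feedster":      {"title": False, "content": True},   # content hit came with a page glitch
    "Bloglines":     {"title": False, "content": False},
    "NetNewsIsFree": {"title": False, "content": False},
    "Google":        {"title": True,  "content": False},
}


def hit_rate(outcomes):
    """Fraction of tests in which the known item was found."""
    return sum(outcomes.values()) / len(outcomes)


def rank_tools(results):
    """Tool names sorted by hit rate, best first; ties keep their input order."""
    return sorted(results, key=lambda tool: hit_rate(results[tool]), reverse=True)


if __name__ == "__main__":
    for tool in rank_tools(RESULTS):
        found = sum(RESULTS[tool].values())
        print(f"{tool}: {found}/{len(RESULTS[tool])} known items found")
```

    Keeping the title and content outcomes separate makes it easy to add further test searches later without changing the tally logic.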

    Remarks

  • Gaps: This test demonstrated substantial gaps in the search capabilities and archiving of known content. There is a definite difference between aggregators. Some appear to retain searchable content longer than others.

  • New-content-search difficult: My overall concern is that if we look at the aggregators as a spectrum of "feed search capabilities", we have to know where the feed is located in time in order to know which aggregator-search will be most effective.

    This is analogous to "already knowing what you are looking for in order to know how to find it." That's not how XML feeds have been advertised. They're saying the opposite -- that you can find new information you didn't know existed.

    I'll believe it when I see searches for known content consistently succeed. Until then, this is analogous to celebrating a search engine without demonstrating that it actually works.

  • Archiving: The "window" during which content remains visible in each aggregator remains to be understood. Perhaps some retain content longer before dumping it.
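
    One way such a window might be estimated is to search each aggregator for known posts of different ages and bracket the point where results stop appearing. The sketch below assumes that approach; the function name `retention_bounds` and the probe data are hypothetical and are not measurements from this test.

```python
def retention_bounds(probes):
    """Bracket an aggregator's retention window from search probes.

    probes: list of (age_in_days, found) pairs, one per known post searched for.
    Returns (oldest_age_found, youngest_age_missed); either may be None
    when there were no hits or no misses.
    """
    found_ages = [age for age, found in probes if found]
    missed_ages = [age for age, found in probes if not found]
    oldest_found = max(found_ages) if found_ages else None
    youngest_missed = min(missed_ages) if missed_ages else None
    return oldest_found, youngest_missed


# Hypothetical probe data, for illustration only.
example_probes = [(1, True), (7, True), (30, False), (90, False)]

if __name__ == "__main__":
    print(retention_bounds(example_probes))
```

    If the oldest hit is older than the youngest miss, the aggregator has gaps rather than a clean cutoff, which is what the results above suggest for at least some of these tools.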

    Conclusion

    Searching for XML feeds through either the aggregators or the Google search engine is hit or miss. Google and Technorati, rather than the other aggregators, remain the most reliable tools for searching historical XML-feed content.

    If I had to choose an XML-feed search tool, I'd go with Technorati; however, because not all blogs are listed there or necessarily have feeds, I'm inclined to start with Google.