30 December 2004

Search, PubSub, 30-Dec-04 UTC 17:25:09

Limitation

PubSub can neither search for nor return XML feeds whose search terms are embedded in HTML comments (the notes placed after "title" in links).

Test Discussion

We pinged PubSub at 30-Dec-04 UTC 16:51:30 with the following blog entry. In advance of this ping we also saved multiple key words from the blog entry. This blog entry discusses the results of this test.

Questions

Is there a minimum amount of time that PubSub needs as an advance notice for a search?

Can PubSub read text embedded in the "title" attribute of HTML links?

Test preparation

We created a blog entry whose links carried comments embedded in the HTML "title" attribute.

Before publishing the blog, we entered key phrases into the PubSub list.

Then we published the entry and pinged PubSub.

Then we tested the known phrase that was not embedded to see if the target blog was visible. Indeed it was.

Test Results

Non-embedded phrase appeared in PubSub: "Did you notice the Yahoo-image forwarded to blogger"

Result: PubSub responded, returning this blog as a valid hit.

This tells us that we've waited long enough: PubSub has had enough time to identify the target blog-spot, the blog is searchable, and it is known to PubSub. Thus, we can continue with the other tests with confidence: the content exists and could be found, as long as PubSub has the capability.

Target Phrases

All target phrases were embedded as HTML notes in the blog after "title" in the links.



Quoted Phrases, Time Last updated

1. "It is there in the blog, you just can see it", 2004-12-30T17:10:51-05:00

2. "Look to the right -- see Publish is not", 2004-12-30T17:12:02-05:00

3. "For more on where this image is from", 2004-12-30T17:12:41-05:00

4. "It is there, but the public cannot see it", 2004-12-30T17:13:06-05:00

5. "Warning: Image mess, may freeze your", 2004-12-30T17:13:33-05:00

6. "Read more about e-feed, or Bloggers e-mail", 2004-12-30T17:14:03-05:00

7. "This is a sample message from a valid", 2004-12-30T17:15:08-05:00

Not enclosed in quotation marks

8. E-feed is a generic term for E-mail to... , 2004-12-30T17:05:38-05:00


Unable to find terms embedded in HTML title-attribute comments

None of the seven embedded HTML notes was searchable with PubSub. Test 8 also failed to find the target blog entry.

Conclusions

1. Outstanding: PubSub has minimal lag time between search initiation and returned results

PubSub was able to find XML feeds containing key terms requested only moments before. That is, there does not appear to be a significant lag between [a] the moment the PubSub user enters a target search string and [b] the moment PubSub, having cycled through the XML feeds, returns a valid hit.

This is another way of saying that "if PubSub can't find it quickly, there's no use waiting around" as PubSub does not need "more time" to find a specific XML-feed. Of course, users may find other results if they wait.

2. Limitation: Embedded comments are not searchable

  • It does not appear as though PubSub can identify key words in an embedded HTML-link-comment after "title".

  • Key phrases within HTML comments after "title" are not searchable using PubSub.

    This is a limitation of which users should be aware. Not all important information is necessarily in plain view. Some bloggers include attribution, detailed comments, and more specific technical terms in the embedded comments.

    It remains to be understood whether other XML-feed-search tools [a] provide this option, or [b] are superior search alternatives to PubSub.

    Recommendation

    XML-feed searches would be improved if they not only looked at the "plain text" visible to the eye, but also the embedded notes at the links.
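
To make the recommendation concrete, here is a minimal sketch of the difference between searching only the visible text and also searching the notes embedded at the links. It assumes the embedded notes are the "title" attribute of anchor tags, which is how we read the test above. The class name, the function name, and the sample entry (including its URL and link text) are hypothetical illustrations; this is not how PubSub itself works.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collect visible text and link "title" attribute text separately."""

    def __init__(self):
        super().__init__()
        self.visible = []  # text a reader sees on the page
        self.titles = []   # notes embedded in <a title="...">

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "title" and value:
                    self.titles.append(value)

    def handle_data(self, data):
        self.visible.append(data)


def phrase_found(html, phrase, include_titles):
    """Return True if phrase occurs in the searched text (case-insensitive)."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    parts = parser.visible + (parser.titles if include_titles else [])
    # Collapse runs of whitespace so line breaks do not split a phrase.
    haystack = " ".join(" ".join(parts).split()).lower()
    return phrase.lower() in haystack


# Hypothetical entry: the URL and the link text are made up for illustration.
SAMPLE = (
    "<p>Did you notice the Yahoo-image forwarded to blogger? "
    '<a href="http://example.com/" '
    'title="It is there, but the public cannot see it">details</a></p>'
)

if __name__ == "__main__":
    embedded = "It is there, but the public cannot see it"
    print(phrase_found(SAMPLE, embedded, include_titles=False))
    print(phrase_found(SAMPLE, embedded, include_titles=True))
```

With include_titles=False the sketch behaves like the limitation described above: the plain-text phrase matches, but the embedded note does not. Setting include_titles=True is the behavior the recommendation asks for.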