12 January 2005

[ R ] Search, PubSub, 13-Jan-05 UTC 00:28:26

Error

Unreliable date-time-stamp data on subscriptions; invalid time-sequences, inconsistent search results of same-parameter search strings.

Issues

The subscription-created and subscription-modified data that PubSub logs and reports does not match the order in which the subscriptions were actually created.

Overview

It appears as though PubSub's internal timing mechanism has a problem. Searches and subscriptions are entered in one sequence, but then self-report their times as created in another order.

For example, if two subscriptions were created, PubSub would not necessarily log the times in the same order. Subscription #1 could be entered first, but actually logged and stamped at a time after the second [#2] subscription was created.


Detailed Error Modes


Content Related

  • XML subscriptions initiated with the same parameters yield inconsistent results;

  • Known-published XML content is not uniformly found within a single XML search tool;

  • Same XML feed creates inconsistent search results;

  • Saved XML feed-URL/URI in PubSub generates inconsistent results;

  • PubSub-generated XML-feed URL/URI-results do not match same XML-URI/URL-generated results from other platforms;

    Time Related

  • Times that "sequential publications are created" do not match the order the times were subsequently logged in PubSub;

  • XML URIs are logged with one set of date-time parameters, while the actual creation times and order of these subscriptions could be different;

  • The date-time data on a subscription is not reliable as an indicator of when the subscriptions were created or the order in which they were created;


  • Discussion

    PubSub allows end-users to store their searches/subscriptions in the PubSub box. Users can access PubSub matches by following the saved subscription link. PubSub displays the results.

    Users may rely on the PubSub-box as a platform to generate search results for their subscription. Users also have the option to take the XML-URL/URI which PubSub generates and place it in another platform or host [aggregator, bookmarks, or directly into other content].

    We conducted a test of the PubSub subscription process.


    Test Steps
  • 1. Create draft document
  • 2. Do not publish or ping Ping-O-Matic with an update of that document
  • 3. Review the PubSub subscription standards; ensure your search terms, phrasing, and quotations are correct
  • 4. Generate a subscription in PubSub with target phrases from that draft document
  • 5. Save the XML-feed URL/URI in another linking device like a bookmark-host
  • 6. Confirm your URLs in both PubSub and the bookmark-host match
  • 7. Wait [insert an arbitrary waiting period] to allow PubSub time to register your subscription and begin looking for your target phrase
  • 8. Publish your content-document with the target phrase
  • 9. Ping PubSub
  • 10. Track how long it takes for your PubSub XML-URL/URIs to produce a result
  • 11. Compare the search results between your PubSub-box and the XML-feed URL/URI you saved in bookmarks.
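Step 11 can be done by eye, but a small script makes the comparison repeatable. The following is only a sketch under stated assumptions: it assumes both views of the subscription can be saved as XML feed documents whose items are `entry` elements carrying an `id` child, and the two sample documents are hypothetical stand-ins, not real PubSub output.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace prefix, e.g. '{ns}entry' -> 'entry'."""
    return tag.rsplit("}", 1)[-1]


def entry_ids(feed_xml):
    """Return the set of entry ids found in a feed document (assumed layout)."""
    ids = set()
    for elem in ET.fromstring(feed_xml).iter():
        if local(elem.tag) != "entry":
            continue
        for child in elem:
            if local(child.tag) == "id" and child.text:
                ids.add(child.text.strip())
    return ids


def compare_feeds(pubsub_xml, bookmark_xml):
    """Report which entries appear in only one of the two views, or in both."""
    a, b = entry_ids(pubsub_xml), entry_ids(bookmark_xml)
    return {"only_pubsub": a - b, "only_bookmark": b - a, "both": a & b}


# Hypothetical sample data: the PubSub-box view has one match, the bookmark has none.
PUBSUB_VIEW = "<feed><entry><id>post-1</id></entry></feed>"
BOOKMARK_VIEW = "<feed></feed>"

diff = compare_feeds(PUBSUB_VIEW, BOOKMARK_VIEW)
print(diff)
```

If both views were consistent, `only_pubsub` and `only_bookmark` would both be empty.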


  • Test Results

    The two XML-feeds were both created and saved at the same time. However, the results as saved and reported in PubSub do not reflect this.


    Search String 3A

    Bookmark link:
    PubSub Subscription
    Last updated: 2005-01-12T19:48:57-05:00
    There are no messages in this digest-- yet.
    There are no results that match your subscription-- yet.
    PubSub Link:
    PubSub: 3A. D Blog
    Last updated: 2005-01-12T17:37:29-05:00
    There is 1 message in this digest.
    Observation: The endings of the two XML-feed URLs/URIs differ. The bookmark link ends in .xml, while the PubSub subscription link [which returned a result] has no .xml in the URL/URI.


    The only way to get the "bookmark link" is to copy it when initially creating the subscription. In other words, if the user truly "made an update" [as reflected in the bookmark link], one would reasonably expect the update time in PubSub to match it or be later.

    If the above data is correct, this would create an unusual chain of events:


  • First, a search was saved; and a subscription created
  • Second, a bookmark was saved at the time the site was created
  • Third, PubSub was updated
  • Fourth, time passes.... [Wait, wait]
  • Fifth, the bookmark was updated, without changing the PubSub subscription time
  • No evidence within PubSub of an update

    Assuming, for the sake of argument, that the above conditions are related only to the end-user, the problem with the above order of events is that it defies logic: How could one create a new link for a bookmark, but then not create an update in the PubSub?

    The reasonable chain of events would be the reverse: if one updated the PubSub subscription [as the later time in the bookmark link suggests, since that is where the link is visible and can be copied], the PubSub subscription time would also change. This did not happen.

    If there was an actual update, we might expect to see some time difference between when the subscription is created, and when the subscription self-reports that there has been an update. However, we find no evidence of this.

    Note the following data contained in the XML-feed-URI which produced the positive result:


    Published: 2005-01-12T 09:57:19-05:00
    Modified: 2005-01-12T 09:57:19-05:00
    The times match. If there was a modification within the subscription, there "should be" a difference between the published time and the modified time.

    Theories on the inconsistencies

    Clock code

    Either PubSub is producing inconsistent results; or the internal clock linked with the XML-URI [associated with the bookmark] doesn't match the clock used to generate the subscription.

    Perhaps the code in one area checks one register that is timed against one reference point, while in another area of the code [associated with the link-generation], the code checks a different set of commands. Alternatively, one section of the code references the time in one manner, while another section references the time in a different way or not at all.

    Search initiation

    Another explanation is a caching or queuing issue: PubSub may generate an XML feed for the user to copy at time X, while the actual "created time" within PubSub is not registered until the backlog of other searches has cleared, the newly created subscription reaches the head of the queue, and PubSub starts searching.

    If this is true [we have no idea], let's look at the time difference between the bookmark time and the PubSub time:


    Bookmark: 2005-01-12T 19:48:57-05:00
    Saved PubSub search: 2005-01-12T 17:37:29-05:00
    We have the opposite. The bookmark time is two hours later than the saved PubSub search. So, this theory is problematic.
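The size and direction of this gap can be checked directly. A minimal sketch, assuming the two "Last updated" values are ISO 8601 strings as shown in the Search String 3A results [written here without the space after the "T"]:

```python
from datetime import datetime

# "Last updated" values reported for Search String 3A.
bookmark = datetime.fromisoformat("2005-01-12T19:48:57-05:00")
pubsub = datetime.fromisoformat("2005-01-12T17:37:29-05:00")

# Positive gap: the bookmark reports a LATER time than the saved PubSub search.
gap = bookmark - pubsub
print(gap)  # 2:11:28
```

Under the backlog theory we would expect the sign to be the other way around, with the PubSub time later than the bookmark time.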

    If the time difference is explained by the backlog and the subscription is not registered as "created" until the actual searches begin, this indicates a two-hour lag between when the subscription was first entered into PubSub [to create the link for the bookmark], and the time that PubSub actually started to conduct the search.

    That's quite a backlog. Yet, such an explanation is problematic in that the backlog in other situations has been negligible. There was no requirement to wait for any period of time between creating the subscription and returning content.

    If true, it would represent a significant degradation in the time needed to process a result. Only a few weeks ago, the results were quick. This would imply either that PubSub subscriptions are growing faster than expected, or that the residual searches, although deleted, are somehow taking up more room than expected, suggesting that the hardware requirements are much higher than initially planned.

    Conduct another review

    The above time-inconsistency does not occur in a subsequent search. In this case, two additional searches were initiated, URLs saved, subscriptions started:




    Search String 5

    String 5A
    PubSub 5A: Last updated: 2005-01-12T 20:51:13-05:00
    Bookmarks 5A: Last updated: 2005-01-12T 20:52:07-05:00

    String 5B
    PubSub 5B: Last updated: 2005-01-12T 20:53:07-05:00
    Bookmarks 5B: Last updated: 2005-01-12T 20:52:51-05:00

    Note: The above times are close, within a minute of each other, indicating that the subscription was created, the XML-link saved, and the terms saved in PubSub all within a short period of time.


    Now, the big surprise. Look back at the original times for Search String 3A. Notice one was reported later than the other. There was a two-hour lag in the bookmarks.

    Also, note in search string 5, the time is around 20:53. Search string 5 was done before search string 3. So, if we look at another view of Search string 3, we should have times all before those of search string 5.

    Surprise, we don't.


    Search String 3B


    PubSub: Last updated: 2005-01-12T 21:04:08-05:00
    Bookmark: Last updated: 2005-01-12T 21:07:40-05:00


  • Strings 3A and 3B were both created at the same time, but there is a gap of roughly three and a half hours [17:37:29 vs 21:04:08].

  • String 5 was actually created before string 3, but reports a later time than 3A [20:53 vs 17:37].
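The ordering problem can be stated mechanically: find every pair where the subscription entered first reports the later time. The sketch below is our own illustration, not anything PubSub provides. The `creation_rank` values are an assumption that encodes the order described in this post [string 5 entered first, strings 3A and 3B entered together afterwards], and the timestamps are the PubSub "Last updated" values quoted above.

```python
from datetime import datetime
from itertools import combinations

# (label, creation_rank, reported PubSub "Last updated").
# creation_rank is assumed from the narrative: 1 = entered first.
subs = [
    ("5A", 1, "2005-01-12T20:51:13-05:00"),
    ("5B", 1, "2005-01-12T20:53:07-05:00"),
    ("3A", 2, "2005-01-12T17:37:29-05:00"),
    ("3B", 2, "2005-01-12T21:04:08-05:00"),
]


def inversions(subs):
    """Return (earlier_created, later_created) pairs whose reported times are reversed."""
    out = []
    for a, b in combinations(subs, 2):
        if a[1] == b[1]:
            continue  # entered at the same time; no order to violate
        first, second = (a, b) if a[1] < b[1] else (b, a)
        if datetime.fromisoformat(first[2]) > datetime.fromisoformat(second[2]):
            out.append((first[0], second[0]))
    return out


print(inversions(subs))  # [('5A', '3A'), ('5B', '3A')]
```

Both flagged pairs involve 3A, the one subscription whose reported time falls hours before everything else.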

    Final check

    In the event there was a problem with any of the previous searches, we created a wildcard. This is another way of saying: if the content, searches, or parameters were in error, or something failed in searches 1, 3, or 5, we would use string X as the back-up to check times, orders, baseline searches, and other assumptions in the baseline tests.

    Specifically, as a final confirming check, we conducted another search. Again, the above procedures were closely followed. This test simply confirmed whether or not the above search strings were valid.

    As expected, the times match closely in both the PubSub and the saved Bookmark, but interestingly produce no search result.


    Search String X


    Bookmark -- Last updated: 2005-01-12T 21:47:38-05:00
    PubSub -- Last updated: 2005-01-12T 21:50:56-05:00

    Note that the time is about 4 hours later than the original Search String 3A time in PubSub [17:37:29].


    With one exception: Same search string, same parameters, but now the results are empty.
  • In other words, nothing has changed in the PubSub search string, but the outputs do not match across similar searches.

    Again, the quotations match, input parameters for the subscription are both the same. The only difference is the result.

    Curiously, note that the time of Search string 3 [09:57:19, 9AM] is now 12 hours earlier than search string X [21:50:56, 9PM]. Recall, the entire search-strings-sequences [1, 3, 5, and X] were inputted and saved within a short time, at most 1 hour.
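As a rough check on that claim, the spread of all the reported times can be compared with the one-hour window in which the strings were entered. A sketch only: the labels are our own, and the values are the timestamps quoted in this post with the space after the "T" removed.

```python
from datetime import datetime, timedelta

# Every timestamp reported above for strings 3, 5, and X.
reported = {
    "3 published/modified": "2005-01-12T09:57:19-05:00",
    "3A PubSub": "2005-01-12T17:37:29-05:00",
    "3A bookmark": "2005-01-12T19:48:57-05:00",
    "5A PubSub": "2005-01-12T20:51:13-05:00",
    "5A bookmark": "2005-01-12T20:52:07-05:00",
    "5B PubSub": "2005-01-12T20:53:07-05:00",
    "5B bookmark": "2005-01-12T20:52:51-05:00",
    "3B PubSub": "2005-01-12T21:04:08-05:00",
    "3B bookmark": "2005-01-12T21:07:40-05:00",
    "X bookmark": "2005-01-12T21:47:38-05:00",
    "X PubSub": "2005-01-12T21:50:56-05:00",
}

times = [datetime.fromisoformat(t) for t in reported.values()]
spread = max(times) - min(times)

print(spread)                        # 11:53:37
print(spread <= timedelta(hours=1))  # False
```

The reported times span almost twelve hours, against an entry window of at most one hour.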

    Review

    A search created and saved after another is now reporting as being created prior to that string. Search String 5 was created prior to search string 3; but PubSub reports 3 was created prior to 5:


    Bookmark 3A: 19:48
    PubSub 3A: 17:37

    PubSub 3B: 21:04
    PubSub 5: 20:53


    Summary table


  • PubSub took data at time 1 and clocked it. But then self-reported that it took the data at a different time.

  • PubSub allowed a user to create multiple search strings in a sequential order, but then reversed the order the searches were "logged as created".


    String    Actual creation    Reported creation

    3         After 5            Before 5
    5         Before 3           After 3


    Questions:

    Why does a "subscription created" generate two different results: Both in subscription content, and in time created?

  • Content

    Why are the XML-links [saved as a bookmark at the time of creating a subscription] not matching the subscription [XML feed stored and used to execute the search in PubSub]?

  • Time-stamp

    Why does the same PubSub subscription generate two different "creation dates": one in PubSub and another associated with the saved-link in a bookmark?

