10 December 2004

Yahoo's bright outlook and Feed Integration: More than combining feeds, it's about letting the user mix and compare data streams

Aggregation is a lovely feature of XML. Users anywhere can grab the feeds they want.

The next generation of aggregators is already in the works. It's more than simply taking a feed and presenting it. The next aggregators are going to let users compare baseline standards, actual performance, and public representations, and then identify the inconsistencies.

But that's not all. The next step is to identify and display these deltas in a workable, usable format, one the end user can act on swiftly. This note outlines a structure not only to aggregate this information but also to sketch a model that developers and users can use as a starting point for a "sample case study."

It is hoped that this hypothetical example serves as a common anecdote from which both the business-user and information-technology communities might discuss common solutions. Developers working under the direction of expert program managers have the requisite talent, skills, and tools to accomplish this objective. It remains to be seen how soon these tools gain popular support to the extent that the "common end user," such as a blogger, has access to them.

Discussion

We begin with a description of the baseline assumptions, parameters, and working environment. Let us suppose that our pseudo-model is an isolated system with simple nominal data streams, and that the number of feeds is small, finite, and closed.

Clearly, these assumptions are absurd given the nature of the internet, but for purposes of discussion they simplify the model into one that will serve our purposes.

Let us further suppose that time takes on a new character: microseconds and nanoseconds are stretched out to months and years. Why we choose this construct will become clear.

Let us also suppose, for the sake of discussion, that our isolated feed-universe consists of five feeds. All are coming into the aggregator, and those five feeds are our only focus.

Also, suppose that the five feeds connect to separate external sources, and that these sources may or may not have competing interests.

In this model, we have a simple system: external sources providing inputs into our single aggregator. Note also that the input feeds are the only thing we are focusing on.

The purpose of this construct is to move away from the nuances of time, integration, and the programmatic issues that arise when we talk in terms of millions of feeds moving at nanosecond speeds.

In short, with this model we are essentially taking a snapshot of a microsecond, and we are going to draw the entire logic tree out into step functions. This is done so that we gain insight into the entire system while looking only at the smallest unit of interaction; at the same time, we acknowledge that the time-extension is purely fabricated.
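To make the construct concrete, here is a minimal, purely hypothetical Python sketch of that closed five-feed world. None of these names or fields come from any real aggregator; they are invented stand-ins for "a feed," "an entry," and "a snapshot of one stretched-out microsecond."

```python
# Hypothetical sketch of the closed five-feed model described above.
# Nothing here is prescribed by the essay; names and fields are invented
# for illustration only.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    title: str
    value: float        # a nominal data point carried by the item
    source: str


@dataclass
class Feed:
    name: str
    entries: List[Entry] = field(default_factory=list)


class Aggregator:
    """Holds a small, finite, closed set of feeds (five in the toy model)."""

    def __init__(self, feeds: List[Feed]):
        if len(feeds) != 5:
            raise ValueError("the toy model assumes exactly five feeds")
        self.feeds = feeds

    def snapshot(self) -> dict:
        """One 'stretched-out microsecond': a frozen view of every feed."""
        return {f.name: list(f.entries) for f in self.feeds}
```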

Logic

However, as you may recall from your advanced coding classes, one of the most important things to consider is the basic logic of the system. However convoluted the end result, all programs and code do exactly what they are told to do. Logic is what drives the computer. We may extend the time-frame of our analysis to one that appears absurdly long, detailed, and ridiculous; yet, at the micro level, the smallest glitch in the code can derail what is otherwise a fabulous construct.

Theory aside, let's consider our model. Five feeds running in. An aggregator. And a simple notion that these five feeds can be forced to interact.

The true absurdity of the current state of affairs on the internet and in feeds is that the goal of the XML system is to ensure that data integrates. Yet we rarely focus on the end user's desire to quickly integrate individual feeds, in real time, into something composite, new, and beyond what existed prior to arrival at the end user's aggregator.

In other words, this model presupposes that the aggregator moves beyond being simply a platform or host that assembles and displays feeds, and becomes a vital tool for the business user to actually force the varying feed elements [news, blogs, websites] to interact, compare themselves, and take on forced values.

Essentially, what this model proposes is that the entire feed-aggregator construct be expanded to let the end user have the feeds pre-integrated and pre-packaged against standards that are more than simple filters: active players in identifying, prioritizing, and comparing data feeds as they come into the system architecture.

At this juncture, it is clear that aggregate information can easily be assembled at the business level, whereby queries can be entered, various data files accessed, and results returned by the system. We might choose to integrate three bodies of knowledge such as statistics, information technology, and inventory control. This combination of disciplines is not new.

At an XML level, this construct would assume that these three data streams could also be pre-mixed prior to inquiry. This is to say that the business manager could easily take a series of indicators, thresholds, and values, pre-write general code, and run routines to create new combinations. Clearly, this is possible with the current state of computing.

What is new is that the individual consumer would also have this capability when working with feeds. Again, the business user relies on individual queries to access data. However, if the business user can have their varying data fields probed from three different perspectives [three bodies of knowledge: statistics, quality control, information technology], then the "logic" behind that capability is discernible.

It is not a quantum leap to imagine feeds being integrated, queried, and tested on the basis of pre-defined user inputs, and then having those feeds pre-arranged into new patterns, relationships, and priorities along user-defined parameters [whatever those might be].
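As a rough illustration of what "pre-arranged along user-defined parameters" might look like, consider the following Python sketch. The feed names, readings, thresholds, and weights are all invented; the point is only that a handful of user-chosen parameters can rank and flag feeds before anyone reads them.

```python
# Hypothetical illustration of "pre-arranging" feeds by user-defined parameters.
# The feeds are just named lists of numeric readings; the threshold and weights
# are arbitrary stand-ins for whatever indicators a business user might choose.
feeds = {
    "baseline":  [100, 101, 99, 100],
    "actual":    [100, 104, 108, 115],
    "public":    [100, 100, 101, 101],
    "anecdotal": [100, 106, 111, 118],
    "external":  [100, 102, 103, 105],
}

user_params = {"threshold": 5.0, "weights": {"anecdotal": 2.0}}


def deviation(series):
    """Spread between the last reading and the first (a crude indicator)."""
    return series[-1] - series[0]


def prioritize(feeds, params):
    """Rank feeds by weighted deviation; flag those over the threshold."""
    ranked = []
    for name, series in feeds.items():
        weight = params["weights"].get(name, 1.0)
        score = weight * deviation(series)
        ranked.append((name, score, score > params["threshold"]))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


for name, score, flagged in prioritize(feeds, user_params):
    print(f"{name:10s} score={score:6.1f} flagged={flagged}")
```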

Let's get specific with an example. Let's go back to our general model and keep in mind that we're dealing with a very elongated time schedule and are using only five feeds.

In this model, we might have a simple interest in viewing baseline data, standards, and publicly available information, and then comparing those with anecdotal information drawn from other databases.

When I speak of databases in the general model, I'm also using that as an analogy for a feed in the specific case.

In a general sense, value is created when we take raw, ill-formed information and create something new. Management also gains leverage when it is able to compare information in new ways, arrive more quickly at conclusions, and adjust the schedules, decision points, and indicators it uses to assess information.

In other words, a new combination of data could very well prove insightful in identifying a trend. But the issue isn't simply "trend recognition"; it is also using that "trend" as a single pseudo-feed and then observing how the other feeds interact, or fail to interact, relative to these insights.

Again, we are not talking about creating new information at this point. We are merely arriving at the general notion that a feed, when it arrives, can remain a single feed even though it has been reformatted into a trending pattern.

The next step is the interesting part. This is when we take the aggregator and force the feed in question through a series of tests and analyses relative to the standards set within the other feeds.

This is to say that our incoming baseline data-feed can be examined by the other feeds relative to external criteria that the original feed may or may not have been compared to.

The new aggregator model would then rearrange the feed data [XML] into new orders and patterns relative to our baseline standards.
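A hypothetical sketch of that rearrangement step follows: one designated baseline, the remaining feeds measured against it, and the whole set re-ordered by how far each feed departs from the baseline. The numbers are fabricated for illustration.

```python
# Hypothetical sketch: measure incoming feeds against a "baseline" feed and
# re-order them by how far they sit from that baseline. Names and numbers
# are invented for illustration only.
baseline = [100, 100, 100, 100]
feeds = {
    "incoming":  [100, 104, 108, 115],
    "public":    [100, 100, 101, 101],
    "anecdotal": [100, 106, 111, 118],
    "external":  [100, 102, 103, 105],
}


def delta(series, reference):
    """Mean absolute difference between a feed and the baseline, point by point."""
    return sum(abs(a - b) for a, b in zip(series, reference)) / len(reference)


# Re-arrange the feeds into a new order: largest departure from baseline first.
reordered = sorted(feeds.items(), key=lambda kv: delta(kv[1], baseline), reverse=True)
for name, series in reordered:
    print(f"{name:10s} delta={delta(series, baseline):5.2f}")
```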

Non-static baselines

But this is where the trick occurs. Rather than requiring the initial feed to be compared to a static baseline, we permit the opposite: the incoming feed itself becomes a baseline used to challenge the validity of the original standards.

In other words, the current models assume that the baseline assumptions and data sets currently coded are set. But what if we take the opposite view?

What if we permit [as a simple test run] the external feed to be the driver, and our pre-existing coding to be adjusted based on this feed? In other words, we arrive at a model where the platform is no longer static and firm, but one that adjusts and arrives at new trending data, which is then compared to our original data sets, which we might use as a standard.

In other words, we no longer presuppose that a single incoming feed is necessarily more right or more wrong; we take a neutral approach. What is this feed actually telling us, and are the metrics, logic theories, and other schema sets applied to our decision rules [already coded] themselves valid?

Then we permit the feed to actually take over the system and monopolize it, on the assumption [for a test case only] that all our baseline assumptions are invalid and that this new feed is actually an indicator of a trend.

At this juncture, we throw away the existing baseline and recalibrate our internal metrics to ask, "What is this feed telling us?" and "What do we need to do differently?" on the assumption that our baseline data is incorrect.
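The same comparison can be run with the roles swapped, which is the whole point of the non-static baseline. The sketch below, again with invented numbers, evaluates the other feeds twice: once against the original standards, and once treating the incoming feed as the baseline for a test case.

```python
# Hypothetical test run of the "non-static baseline" idea: instead of judging
# the new feed against the old baseline, temporarily treat the new feed as
# the baseline and see how the original feeds hold up. All values invented.
old_baseline = [100, 100, 100, 100]
new_feed     = [100, 104, 108, 115]
other_feeds  = {
    "public":    [100, 100, 101, 101],
    "anecdotal": [100, 106, 111, 118],
    "external":  [100, 102, 103, 105],
}


def mean_abs_delta(series, reference):
    return sum(abs(a - b) for a, b in zip(series, reference)) / len(reference)


def evaluate(reference, label):
    print(f"--- baseline = {label} ---")
    for name, series in other_feeds.items():
        print(f"{name:10s} delta={mean_abs_delta(series, reference):5.2f}")


evaluate(old_baseline, "original standards")   # the conventional view
evaluate(new_feed,     "incoming feed")        # the inverted, test-case view
```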

In other words, let us suppose that our current five feeds, one of which is the "new feed data," are integrated, but that the four "non-new" ones are up for grabs. Must they be evaluated in terms of the new information?

Again, we don't know, because the present platforms make the error of assuming that the existing data sets are the baseline against which to measure data. But if we take the opposite view, that the "new feed" is actually the tip of the iceberg on a new trend, we arrive at a different outcome.

This is to say: we then review our existing four feeds in light of this new trend, and do stress testing on the very feeds we would normally ignore for purposes of analysis. Again, we are not saying that this is a valid outcome; only that the current systems available to consumers do not permit this level of sensitivity analysis.

In other words, the "baseline feeds" could then be challenged in terms of "are they correct, valid an useful" and "if we were to presume that the initial-incoming feed is the new trend, what outcomes might we anticipate when this 1 feed interfaces with the static four feeds.

The results might not be useful. Or they could actually be something novel. This is what makes the feed integration process noteworthy.

Application to the judicial system

Let's get specific. In the judicial system there are baseline rules and schema called statutes and laws. These form the basis of interaction. They are cues to interact.

However, let us suppose that without notice these rules change. Clearly, the public is "presumed to know the law." But suppose the decision maker or magistrate in this case is not aware of the nature of the deviation.

A static model would presuppose that this input is a deviation and must be corrected.

However, if the model presupposes that logic exists, then one might expect that there is some flexibility in how the system responds to this initial input.

The next step would be to take the "jury instructions feed" and compare it with the "observed behavior feed," then identify the deltas. Then we take the "hard-drive informant-data feed" and compare it with the initial input. Then we identify whether there is a decision, a threat, a requirement to adjust, or an indicator of a wider trend.
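A toy version of that classification step might look like the following, where the thresholds for "requirement to adjust" and "wider trend" are arbitrary placeholders rather than anything a real court or legislature would use.

```python
# Hypothetical sketch of the classification step: compare an "instructions"
# feed with an "observed behavior" feed, then label the delta as tolerable,
# a required adjustment, or a possible wider trend. Thresholds are arbitrary.
instructions = [10, 10, 10, 10]     # what the rules say should happen
observed     = [10, 11, 13, 16]     # what the behavior feed actually reports


def classify(instructions, observed, adjust_at=2, trend_at=4):
    deltas = [abs(o - i) for i, o in zip(instructions, observed)]
    worst = max(deltas)
    growing = all(b >= a for a, b in zip(deltas, deltas[1:]))
    if worst >= trend_at and growing:
        return "indicator of a wider trend"
    if worst >= adjust_at:
        return "requirement to adjust"
    return "within tolerance"


print(classify(instructions, observed))   # -> "indicator of a wider trend"
```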

The deviation is not necessarily something bad. It may be useful information signaling that the jury instructions have reached their threshold utility, and that system oversight has failed to adequately regulate the factors that would otherwise report a false positive.

This model would also provide feedback to the legislature and leadership that there was a problem and something new was required: not after many months of doing nothing, but at the first sign of a problem, when quick action was needed.

The present models do the opposite. This is the downfall of aggregated systems: more emphasis on coding and less emphasis on the legal structure required to support that system. A specific feed, "statutes or jury instructions," could be dovetailed with "observed behavior feeds" and "feeds from other sources" to assess whether the deviation is a problem or simply feedback that our system is outdated.

We do not know. But the current feeds do not do this at the consumer level. Nor do they allow the individual PC user to take many feeds, compare them to arbitrary standards, and then allocate and assign importance and relevance based on these baseline data.

Moreover, the current consumer feeds do not allow the consumer the opportunity to stress test the information, rearrange the elements into new patterns based on a new feed, or deviate from the set standards.

This model presupposes that the current "this is what the codes and inputs say" baselines are static, and that the "consumer-level product" would be enhanced if these rigid rules were abandoned, so that baseline comparisons are no longer static and become just as malleable as the initial feed.

In short, we have feeds competing with each other: Which is most useful? Which feed has greater credibility? Which feeds, if the trend is real, indicate that we must immediately abandon the existing framework?

This is not to say that rules need to be abandoned, but that we generalize the approaches and arrive at outcomes that fall well within the existing constraints. Feeds can be used to interact with each other, and we can permit a feed to adjust both the baselines and constructs used to analyze it and the way we aggregate the individual feed relative to those other baselines.

Again, the consumer does not have this tool.

Let's consider some situations where this feed integration model, when applied, would be superior to the current feed model. Again, we're looking at the artificial feed construct, but we're maximizing our constraints so that the variables, when applied, go beyond a simple multiple linear regression model and are something along the lines of quantum physics.

Multiple feed integration isn't simply laying the information out; it is permitting those feeds to interact, and at times allowing one feed to take control of our baseline, stress testing it until our model is no longer useful.

Application to accounting

Let's dive into a rather arcane area of accounting. Many numbers. Lots of rules under Generally Accepted Auditing Standards [GAAS]. Suppose there were a system whereby a senior audit partner could aggregate all the incoming data feeds and reshape them into a schema of her own choosing.

This is not to say that the model under GAAS is invalid; rather, it is to permit the senior partner to take the incoming data, apply experience, and then dovetail these feeds with the SAS 99 fraud indicators.

At this juncture, the partner could very well look at the baseline feed and at first blush say, "We have no issues." However, as you have probably heard regarding Enron and Andersen, there were indicators of problems supplied to the SEC well before the meltdown. These indicators came in the form of early reports in 1993 that the Enron corporation was having some problems. Specifically, Enron had undergone an SEC audit in the early 1980s, and the SEC was charged with a follow-up.

What happened? Where were the follow-on results? And how were the subsequent "data feeds of system status" dovetailed with the "new standards placed on Enron" in light of the initial problem? We speculate that these standards were not aggressively applied; with respect to our feed model, it appears as though the "lessons learned" of the feed were not permitted to take precedence over the GAAS standards or the senior partner's planning documents.

Had a model of an "integrated feed" existed, we might have seen different results. Rather than dismiss the initial signs of problems or explain them away, we might have used the "unusual feed responses" to reshape the environment in terms of "What new questions do we need to ask?" and "If we fail to ask these questions in the potentially new environment, what could be the risk?"

This is not to say that audit sampling has no purpose. Rather, it is to suggest that we permit, for a moment, a single feed to take on a different role, weighting, and structure within the aggregator, and then evaluate the range of expected responses once we permit this feed to drive the questioning.

Let us apply this model to a systems architecture development effort. Suppose that under SAS 99 one of the indicators of fraud is the approach management takes toward outside auditors, and at the same time we have known, discrete reports of this conduct occurring.

However, under the existing conditions, we have yet to see how the "many press reports" are explained away; while at the same time in hindsight it all "becomes so obvious."

Applying the hypothetical model

Let's apply the model to this situation. We arrive at new questions. We are compelled to ask new things. For example, using this feed model, we might turn the inquiry on its head, recalibrate the previous working paper in light of this new feed, and then ask, "How does management explain this?"

But rather than simply taking their answer and doing a follow-up, we consider their response in terms of previous observations.

Let's be specific. Suppose one audit feed is the prior audit report. Suppose another data feed is the management letters sent from the auditors to management. And suppose there is a third feed representing information from the follow-up visit by the auditors to monitor the issues and confirm they have been corrected.

On the surface, the existing model proves valid. However, here is where things break down. Nowhere is there a check within the system to aggregate the follow-on results with the pre-audit feeds, and then compare the separate information gleaned from the standards, the anecdotes, and the "less optimal feed."

At first blush, we take the written information as valid; and when this is brought before a grand jury they simply rely on the evidence.

However, under this feed model, suppose we have a new way of aggregating the information, permitting the feeds to interact. We might ask, "If management has publicly stated X, but their anecdotal practice is Y, should we assign a higher feed-weighting to the source that provides evidence of actual practices, as opposed to the documented information, which appears to be impeachable?"
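One crude way to express that weighting rule in code: let the divergence between the stated feed and the observed-practice feed shrink the credibility of the stated feed. The formula and the numbers are invented for illustration; a real system would need something far more defensible.

```python
# Hypothetical weighting rule: if what a source states publicly (X) keeps
# diverging from what the anecdotal feed reports it actually doing (Y),
# shift weight from the documented source toward the observed one.
stated   = [5, 5, 5, 5]     # management's public representation feed
observed = [5, 6, 8, 11]    # anecdotal / observed-practice feed


def credibility_weights(stated, observed):
    """Return (weight_stated, weight_observed), normalized to sum to 1."""
    divergence = sum(abs(s - o) for s, o in zip(stated, observed))
    raw_stated = 1.0 / (1.0 + divergence)   # more divergence -> less credible
    raw_observed = 1.0
    total = raw_stated + raw_observed
    return raw_stated / total, raw_observed / total


w_stated, w_observed = credibility_weights(stated, observed)
print(f"stated feed weight   : {w_stated:.2f}")
print(f"observed feed weight : {w_observed:.2f}")
```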

Moreover, when a case goes to court, the prosecutor does not necessarily end the review once final judgment is granted. The prosecutor will simply continue taking anecdotal information from other sources and determine whether investigators or back-stopped informants need to be redeployed in the given area or an analogous one.

Let us suppose, for the sake of argument, that a prosecutor has found material information that a public official has engaged in malfeasance, destroyed evidence, and is fabricating information. To prevail in litigation they must present this to the court. But rather than display this information for all to see, it might be useful if the prosecutor could take the "informant feed," the "caselaw feed," the "audit feed," and the "management response letter feed" and integrate them into a new pattern to discern whether one feed should be given less weight than another.

Integrating already gathered information as a separate streaming feed

In other words, although the case may have been litigated and gone to trial with a successful prosecution, the current feed model does not then create a new feed out of this outcome and reverse itself to review the historical information in light of the new trends.

For example, such a model might give us new insight into the reasons for an early departure. Rather than take management's statements at face value, we might then ask, "Is this earlier resignation indicative of something that was a deviation, and what work was or was not accomplished while dealing with this priority?"

Again, we're taking the post-litigation feed and using it as the "new feed" to reexamine the previous data, ask new questions, and arrive at a larger feed of information prompting new questions: Have we believed someone who should be second-guessed? Should the pre-litigation feed inputs we have be taken in a new light? How might we reexamine the motivation behind the public statements or assertions given? What was the basis for hiring this manager? Are there other patterns of conduct that we might observe given this new retro-look at the situation?

This is a useful tool for the public to have, because the public could then discern new patterns not only in the existing data but also take a backwards look at the assumed patterns they were led to believe were correct.

In short, the public can do its own stress testing on the data inputs; evaluate whether the information gleaned is correct, valid, and useful; and assign their own weighting to a feed source based on whatever criteria the individual consumer judges more valid or useful.

Existing models fall below customer needs

Again, the current feed aggregators do not do this. They simply take the information and display it, and the consumer is forced to wade through it. A better aggregator would do both forward and backward weighting of feeds, integrate feeds more tightly, and allow the consumer to cast aside or disregard the baseline data constructs in the existing scheme.
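A minimal sketch of what "forward and backward weighting" could mean in practice: when a new item confirms a pattern, re-weight the historical items that pointed the same way (the backward pass) and give the confirming source more weight for future items (the forward pass). The data structure and the boost value are assumptions, not a real design.

```python
# Hypothetical sketch of forward and backward weighting of feed items.
# Structure and numbers are invented for illustration only.
history = [
    {"source": "press",    "signal": +1},   # hinted at the problem early
    {"source": "official", "signal": -1},   # said everything was fine
    {"source": "press",    "signal": +1},
]
source_weight = {"press": 1.0, "official": 1.0}


def absorb(new_item, history, source_weight, boost=0.5):
    # Backward pass: bump historical items that agree with the new item.
    for item in history:
        item["weight"] = 1.0 + (boost if item["signal"] == new_item["signal"] else 0.0)
    # Forward pass: the confirming source gets more weight going forward.
    source_weight[new_item["source"]] += boost
    history.append({**new_item, "weight": 1.0})


absorb({"source": "press", "signal": +1}, history, source_weight)
print(history)
print(source_weight)
```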

This is not to say that the rules are to be abandoned. Rather, it is to suggest that if we arguably live in a logical world based on reason, then feeds could reasonably be expected to integrate in far more dynamic ways.

Going forward, feeds need to be looked at not just in a new way, but in a variety of ways. No single way is more or less correct. A feed, even if it is invalid, provides information. And the end user can make an informed decision about whether the information on the "status of that feed" should be adhered to or cast aside.

In many cases, what we have been led to believe is a "wrong way of looking at something" is really management's desire to ensure the pre-scripted outcome is achieved, rather than letting the system be held to account when new information arrives indicating that the baseline standards, schema, assumptions, and ways of modeling the information are invalid and less useful.

The next step in the analysis is to evaluate the merits of this approach. Clearly, such a model is challenging in that it essentially turns XML on its head: it permits the end user to mandate that the XML system as a whole also become a feed to analyze, integrate, and challenge. In short, we already have initial indications that XML developers are not happy with XML, yet the coding continues.

If we were to apply this feed-integration model in real time, we might ask, "What is a more valuable way forward?", "Whose vision of the future is more compelling?", and "How are their individual actions matching up to the standards applied to others?"

Markets are efficient means of communicating information

We leave this for history to decide. But what is clear is that, in the end, Wall Street always finds out. We saw Enron and Andersen both go down in flames despite the many early signs that there were problems, and despite the ever-higher levels of absurdity required to ignore the feedback.

So too with the many excuses for infighting within the XML community, which in the end do the public a disservice by not simply spending the time to create the tools needed to manage feeds and, ultimately, serve the customer.

For it is the customer, not the program manager or the venture capitalist, who is the source of continued future income streams. At the juncture when the public realizes that they have alternatives, that superior products are available, then the public, however stupid you think they are, will ultimately shift.

We have moved beyond the days of horse and buggy. But despite automobiles, we still see great battles in Afghanistan waged on horseback.

The absolutism of standards needs to be seen for what it is: a guide that can be adjusted as our situation adjusts. There may be a day when the "less superior products" actually prove more viable, dynamic, and useful in situations we had not imagined. There may be times when the "least well-fitting object" is the very abrasion needed to provide real feedback on the limits of the system and the need to redesign, recalibrate, and start anew.

The world is connected. It is that simple. Information moves at the speed of light. But faster than that is the passion of free people to make informed decisions. This model will ensure that the feeds provide that information, are freed of unusable constructs, and are actually forced to interact in ways no one imagined.

We can only speculate about what may happen. That is why we speculate on Wall Street. Soon, within a matter of hours, the markets will open, and how the market chooses to respond remains to be seen.

That day is coming. The hour is arriving. Perhaps the feeds can be persuaded to ignore the many indicators that things are not as happy and peaceful in IT land as we have been led to believe.

It remains to be seen what new questions come in from the auditors; what new probing occurs because of new insight; and to what extent feedback that might otherwise be ignored is suddenly considered material, prompting venture capitalists, investment banks, and others to move capital away from where it is least needed and make it available where there are fresh ideas, solutions, and greater returns.

Yahoo

That is why I am pleased to give public praise to the Yahoo aggregator. It is dynamic. It works with any feed. And it has demonstrated that it will work with the customer.

What Yahoo has done is turn the industry on its head. No longer does it matter whether the feed is RSS or Atom or some other flavor of XML; Yahoo works with them all, however big or small. Yahoo shows it knows where the funding comes from: the consumer, their eyeballs, and their ease in setting up the system.

It is that simple. It is about integration and serving the public. And if you had been reading this blog over the last two weeks, you'd see that "feed integration" would have already told you what I've just said.

Management at Yahoo is listening. Others are not. Yahoo has a system in place that works even when they have not been warned. Others do not. Yahoo works. It does the job. Others fall down. The market knows when it is getting excuses.

Summary

This discussion outlined a feed integration model. It described how feeds can be integrated and given weight, and how existing standards of performance can be readjusted by a simple feed input.

Not only can feeds be assembled as a linear model, but we can combine them to find new patterns and let the feed-analysis results then adjust our own parameters and weights.

In short, we allow the feeds to affect us, our models, and the way we look at the feed. Whether we recognize this before or after the competition does is what will ultimately decide whether we remain competitive or irrelevant.

Whether this system exists or not is irrelevant. The far more important issue is: Who already has this system; what outcomes are they achieving; and to what ends have we failed to see it being put to use.

Surprises are all around us. Prepare to be surprised with Yahoo. They're on the right track. And they know what needs to be done to gain both market share and margins, not simply platforms and conference notes.