Microsoft announced earlier this week that a CEP/stream processing product will be included in SQL 2008 R2. Complex Event Processing, or CEP, is primarily an event processing concept that deals with the task of processing multiple events with the goal of identifying the meaningful events within the event cloud. CEP employs techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, and relationships between events such as causality, membership, and timing, and event-driven processes.
Microsoft called out four reasons to me why CEP might be needed in addition to ordinary database processing. Two are the standard reasons for data reduction:
1. Without CEP, you can’t bang the data into the database fast enough.
2. You don’t want to keep most of the data past a short time window anyway.
The other two are also fairly standard reasons for using CEP:
3. Standard SQL isn’t all that great for time series anyway.
4. CEP use cases often call for incremental processing and/or parameterization of queries, something CEP engines are commonly better designed for than are DBMS.
However, Microsoft seems to be taking a somewhat different approach to time-based SQL extensions than some other vendors. To quote email Microsoft sent today:
Microsoft Research (MSR) introduced the temporal extensions to relational algebra based upon a notion of application time that is independent of system time. It matters when the event originated instead of when they arrived at the processing system. Further it treats each event as being associated with an interval of time as opposed to a point in time. This helps in modeling certain real life phenomenon naturally. [StreamBase et al.] also reason about multiple streams. Both the approaches are extensions to relational algebra. The MSR approach took the algebra as the starting point while StreamBase took an existing language over the algebra – SQL as the starting point. The MSR approach consequently avoids having to rework other elements of the SQL surface. The primary language extensions through which this algebra will be exposed initially is LINQ.
What are the implications of this? Can we use the CEP algorithm to monitor real time data from the cloud and extract only the necessary data to our datawarehouse ? or am i going to far with this ?