<html>

<head>

<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="GENERATOR" content="Microsoft FrontPage 6.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<link rel="stylesheet" type="text/css" href="../../../boost.css">

<title>The Boost Statechart Library - Performance</title>
</head>

<body link="#0000FF" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary=
"header">
  <tr>
    <td valign="top" width="300">
      <h3><img alt="C++ Boost" src="../../../boost.png" border="0"
      width="277" height="86"></h3>
    </td>
    <td valign="top">
      <h1 align="center">The Boost Statechart Library</h1>
      <h2 align="center">Performance</h2>
    </td>
  </tr>
</table>

<hr>
<ul>
  <li><a href="#SpeedVersusScalabilityTradeoffs">Speed versus scalability
    tradeoffs</a></li>
  <li><a href="#MemoryManagementCustomization">Memory management
    customization</a></li>
  <li><a href="#RttiCustomization">RTTI customization</a></li>
  <li><a href="#ResourceUsage">Resource usage</a></li>
</ul>

<h2><a name="SpeedVersusScalabilityTradeoffs">Speed versus scalability
tradeoffs</a></h2>
<p>Quite a bit of effort has gone into making the library fast for small,
simple machines and scalable at the same time. (This applies only to
<code>state_machine&lt;&gt;</code>; there is still some room for optimizing
<code>fifo_scheduler&lt;&gt;</code>, especially for multi-threaded builds.)
While I believe it should perform reasonably in most applications, the
scalability does not come for free. Small, carefully handcrafted state
machines will thus easily outperform equivalent Boost.Statechart machines.
To get a picture of how big the gap is, I implemented a simple benchmark in
the BitMachine example. The Handcrafted example is a handcrafted variant of
the 1-bit-BitMachine implementing the same benchmark.</p>
<p>I tried to create a fair but somewhat unrealistic worst-case
scenario:</p>

<ul>
  <li>For both machines, exactly one object of the only event is allocated
    before starting the test. This same object is then sent to the machines
    over and over</li>
  <li>The Handcrafted machine employs GOF-visitor double dispatch. The
    states are preallocated so that event dispatch &amp; transition amount
    to nothing more than two virtual calls and one pointer assignment</li>
</ul>
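<p>The handcrafted scheme can be sketched in plain C++. The names below are
illustrative only (they are not taken from the Handcrafted example's
sources), but the cost structure is the one described above: one virtual
call into the event, one virtual call into the state, and a pointer
assignment for the transition:</p>

<pre>
#include &lt;cassert&gt;

struct Event;

struct State {                       // base class of all states
  virtual ~State() {}
  virtual State* reactToFlip(const Event&amp;) = 0;  // 2nd virtual call
};

struct Event {                       // base class of all events (the visitor)
  virtual ~Event() {}
  virtual State* dispatch(State&amp; s) const = 0;   // 1st virtual call
};

struct Flip : Event {
  State* dispatch(State&amp; s) const override { return s.reactToFlip(*this); }
};

struct On : State { State* reactToFlip(const Event&amp;) override; };
struct Off : State { State* reactToFlip(const Event&amp;) override; };

// Preallocated states: a transition is just a pointer swap.
static On onState;
static Off offState;
State* On::reactToFlip(const Event&amp;) { return &amp;offState; }
State* Off::reactToFlip(const Event&amp;) { return &amp;onState; }

int main() {
  const Flip flip;                   // one event object, reused over and over
  State* current = &amp;offState;
  for (int i = 0; i != 1000001; ++i)
    current = flip.dispatch(*current);   // the one pointer assignment
  assert(current == &amp;onState);     // an odd number of flips toggles the bit
  return 0;
}
</pre>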
<p>The benchmarks, running on a 3.2GHz Intel Pentium 4, produced the
following dispatch and transition times per event:</p>
<ul>
  <li>Handcrafted:
    <ul>
      <li>MSVC7.1: 10ns</li>
      <li>GCC3.4.2: 20ns</li>
      <li>Intel9.0: 20ns</li>
    </ul>
  </li>
  <li>1-bit-BitMachine with customized memory management:
    <ul>
      <li>MSVC7.1: 150ns</li>
      <li>GCC3.4.2: 320ns</li>
      <li>Intel9.0: 170ns</li>
    </ul>
  </li>
</ul>
<p>Although this is a big difference, I still think it will not be
noticeable in most real-world applications. No matter whether an
application uses handcrafted or Boost.Statechart machines, it will...</p>

<ul>
  <li>almost never run into a situation where a state machine is swamped
    with as many events as in the benchmarks. A state machine will almost
    always spend a good deal of time waiting for events (which typically
    come from a human operator, from machinery or from electronic devices
    over often comparatively slow I/O channels). Parsers are just about the
    only application of FSMs where this is not the case. However, parser
    FSMs are usually not directly specified on the state machine level but
    on a higher one that is better suited for the task. Examples of such
    higher levels are: Boost.Regex, Boost.Spirit, XML Schemas, etc.
    Moreover, the nature of parsers allows for a number of optimizations
    that are not possible in a general-purpose FSM framework.<br>
    Bottom line: While it is possible to implement a parser with this
    library, it is almost never advisable to do so, because other
    approaches lead to better-performing and more expressive code</li>
  <li>often run state machines in their own threads. This adds considerable
    locking and thread-switching overhead. Performance tests with the
    PingPong example, where two asynchronous state machines exchange
    events, gave the following times to process one event and perform the
    resulting in-state reaction (using the library with
    <code>boost::fast_pool_allocator&lt;&gt;</code>):
    <ul>
      <li>Single-threaded (no locking and waiting): 840ns / 840ns</li>
      <li>Multi-threaded with one thread (the scheduler uses mutex locking
        but never has to wait for events): 6500ns / 4800ns</li>
      <li>Multi-threaded with two threads (both schedulers use mutex
        locking and exactly one always waits for an event): 14000ns /
        7000ns</li>
    </ul>
    As mentioned above, there definitely is some room to improve the
    timings for the asynchronous machines. Moreover, these are very crude
    benchmarks, designed to show the overhead of locking and thread context
    switching. The overhead in a real-world application will typically be
    smaller, and other operating systems can certainly do better in this
    area. However, I strongly believe that on most platforms the threading
    overhead is usually larger than the time that Boost.Statechart spends
    on event dispatch and transition. Handcrafted machines will inevitably
    have the same overhead, making raw single-threaded dispatch and
    transition speed much less important</li>
  <li>almost always allocate events with <code>new</code> and destroy them
    after consumption. This will add a few cycles, even if event memory
    management is customized</li>
  <li>often use state machines that employ orthogonal states and other
    advanced features. This forces the handcrafted machines to use a more
    adequate and more time-consuming book-keeping</li>
</ul>

<p>Therefore, in real-world applications event dispatch and transition
does not normally constitute a bottleneck, and the relative gap between
handcrafted and Boost.Statechart machines also becomes much smaller than in
the worst-case scenario.</p>
<h3>Detailed performance data</h3>
<p>In an effort to identify the main performance bottleneck, the example
program "Performance" has been written. It measures the time that is spent
to process one event in different BitMachine variants. In contrast to the
BitMachine example, which uses only transitions, Performance uses a varying
number of in-state reactions together with transitions. The only difference
between in-state reactions and transitions is that the former neither enter
nor exit any states. Apart from that, the same amount of code needs to be
run to dispatch an event and execute the resulting action.</p>
<p>The following diagrams show the average time the library spends to
process one event, depending on the percentage of in-state reactions
employed. 0% means that all triggered reactions are transitions. 100% means
that all triggered reactions are in-state reactions. I draw the following
conclusions from these measurements:</p>
<ul>
  <li>The fairly linear course of the curves suggests that the measurements
    with a 100% in-state reaction ratio are accurate and not merely a
    product of optimizations in the compiler. Such optimizations might have
    been possible due to the fact that in the 100% case it is known at
    compile time that the current state will never change</li>
  <li>The data points with 100% in-state reaction ratio and speed-optimized
    RTTI show that modern compilers seem to inline the complex-looking
    dispatch code so aggressively that dispatch is reduced to little more
    than what it actually is: one virtual function call followed by a
    linear search for a suitable reaction. For instance, in the case of the
    1-bit BitMachine, Intel9.0 seems to produce dispatch code that is as
    efficient as the two virtual function calls in the Handcrafted
    machine</li>
  <li>On all compilers and in all variants, the time spent in event
    dispatch is dwarfed by the time spent to exit the current state and
    enter the target state. It is worth noting that BitMachine is a flat
    and non-orthogonal state machine, representing a close-to-worst case.
    Real-world machines will often exit and enter multiple states during a
    transition, which further dwarfs pure dispatch time. This makes the
    implementation of constant-time dispatch (as requested by a few people
    during the formal review) an undertaking with little merit. Instead,
    future optimization effort will concentrate on state entry and state
    exit</li>
  <li>Intel9.0 seems to have problems optimizing/inlining code as soon as
    the amount of code grows over a certain threshold. Unlike with the
    other two compilers, I needed to compile the tests for the 1, 2, 3 and
    4-bit BitMachine into separate executables to get good performance.
    Even then, the performance of the 4-bit BitMachine was overly bad. It
    was much worse when I compiled all 4 tests into the same executable.
    This surely looks like a bug in the compiler</li>
</ul>
<h3>Out of the box</h3>

<p>The library is used as is, without any optimizations/modifications.</p>
<p><img alt="PerformanceNormal1" src="PerformanceNormal1.gif" border="0"
width="371" height="284"> <img alt="PerformanceNormal2" src=
"PerformanceNormal2.gif" border="0" width="371" height="284"> <img alt=
"PerformanceNormal3" src="PerformanceNormal3.gif" border="0" width="371"
height="284"> <img alt="PerformanceNormal4" src="PerformanceNormal4.gif"
border="0" width="371" height="284"></p>
<h3>Native RTTI</h3>

<p>The library is used with <a href=
"configuration.html#ApplicationDefinedMacros">BOOST_STATECHART_USE_NATIVE_RTTI</a>
defined.</p>
<p><img alt="PerformanceNative1" src="PerformanceNative1.gif" border="0"
width="371" height="284"> <img alt="PerformanceNative2" src=
"PerformanceNative2.gif" border="0" width="371" height="284"> <img alt=
"PerformanceNative3" src="PerformanceNative3.gif" border="0" width="371"
height="284"> <img alt="PerformanceNative4" src="PerformanceNative4.gif"
border="0" width="371" height="284"></p>
<h3>Customized memory management</h3>

<p>The library is used with customized memory management
(<code>boost::fast_pool_allocator</code>).</p>
<p><img alt="PerformanceCustom1" src="PerformanceCustom1.gif" border="0"
width="371" height="284"> <img alt="PerformanceCustom2" src=
"PerformanceCustom2.gif" border="0" width="371" height="284"> <img alt=
"PerformanceCustom3" src="PerformanceCustom3.gif" border="0" width="371"
height="284"> <img alt="PerformanceCustom4" src="PerformanceCustom4.gif"
border="0" width="371" height="284"></p>
<p>At the heart of every state machine lies an implementation of double
dispatch. This is due to the fact that the incoming event and the active
state define exactly which <a href=
"definitions.html#Reaction">reaction</a> the state machine will produce.
For each event dispatch, one virtual call is followed by a linear search
for the appropriate reaction, using one RTTI comparison per reaction. The
following alternatives were considered but rejected:</p>
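<p>The chosen scheme can be illustrated with a small stand-alone sketch.
The types below are my own stand-ins, not the library's internals; the
point is the cost model: one virtual call into the active state to obtain
its reaction table, then a linear search using one RTTI comparison per
visited reaction:</p>

<pre>
#include &lt;cassert&gt;
#include &lt;typeinfo&gt;
#include &lt;vector&gt;

struct Event { virtual ~Event() {} };
struct EvOn : Event {};
struct EvOff : Event {};

struct Reaction {
  const std::type_info* trigger;  // which event type this reaction handles
  int result;                     // stand-in for "execute the reaction"
};

struct State {
  virtual ~State() {}
  // The one virtual call: the active state exposes its reaction table.
  virtual const std::vector&lt;Reaction&gt;&amp; reactions() const = 0;

  int react(const Event&amp; ev) const {
    for (const Reaction&amp; r : reactions())   // linear search...
      if (*r.trigger == typeid(ev))         // ...one RTTI comparison each
        return r.result;
    return -1;                              // no reaction: discard the event
  }
};

struct Idle : State {
  const std::vector&lt;Reaction&gt;&amp; reactions() const override {
    static const std::vector&lt;Reaction&gt; table = {
      { &amp;typeid(EvOn), 1 }, { &amp;typeid(EvOff), 2 }
    };
    return table;
  }
};

int main() {
  Idle idle;
  assert(idle.react(EvOn()) == 1);    // found on the first comparison
  assert(idle.react(EvOff()) == 2);   // found on the second comparison
  return 0;
}
</pre>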
"http://www.objectmentor.com/resources/articles/acv.pdf">Acyclic
|
|
Packit |
58578d |
visitor: This double-dispatch variant satisfies all scalability
|
|
Packit |
58578d |
requirements but performs badly due to costly inheritance tree
|
|
Packit |
58578d |
cross-casts. Moreover, a state must store one v-pointer for each
|
|
Packit |
58578d |
reaction what slows down construction and makes memory management
|
|
Packit |
58578d |
customization inefficient. In addition, C++ RTTI must inevitably be
|
|
Packit |
58578d |
turned on, with negative effects on executable size. Boost.Statechart
|
|
Packit |
58578d |
originally employed acyclic visitor and was about 4 times slower than it
|
|
Packit |
58578d |
is now (MSVC7.1 on Intel Pentium M). The dispatch speed might be better
|
|
Packit |
58578d |
on other platforms but the other negative effects will remain
|
|
Packit |
58578d |
|
|
Packit |
58578d |
GOF
|
|
Packit |
58578d |
Visitor: The GOF Visitor pattern inevitably makes the whole machine
|
|
Packit |
58578d |
depend upon all events. That is, whenever a new event is added there is
|
|
Packit |
58578d |
no way around recompiling the whole state machine. This is contrary to
|
|
Packit |
58578d |
the scalability requirements
|
|
Packit |
58578d |
|
|
Packit |
58578d |
Single two-dimensional array of function pointers: To satisfy
|
|
Packit |
58578d |
requirement 6, it should be possible to spread a single state machine
|
|
Packit |
58578d |
over several translation units. This however means that the dispatch
|
|
Packit |
58578d |
table must be filled at runtime and the different translation units must
|
|
Packit |
58578d |
somehow make themselves "known", so that their part of the state machine
|
|
Packit |
58578d |
can be added to the table. There simply is no way to do this
|
|
Packit |
58578d |
automatically and portably. The only portable way that a state
|
|
Packit |
58578d |
machine distributed over several translation units could employ
|
|
Packit |
58578d |
table-based double dispatch relies on the user. The programmer(s) would
|
|
Packit |
58578d |
somehow have to manually tie together the various pieces of the
|
|
Packit |
58578d |
state machine. Not only does this scale badly but is also quite
|
|
Packit |
58578d |
error-prone
|
|
Packit |
58578d |
|
|
Packit |
58578d |
|
|
Packit |
58578d |
|
|
Packit |
58578d |
"MemoryManagementCustomization">Memory management customization
|
|
Packit |
58578d |
|
|
Packit |
58578d |
<p>Out of the box, everything (event objects, state objects, internal data,
etc.) is allocated through <code>std::allocator&lt; void &gt;</code> (the
default for the Allocator template parameter). This should be satisfactory
for applications meeting the following prerequisites:</p>
<ul>
  <li>There are no deterministic reaction time (hard real-time)
    requirements</li>
  <li>The application will never run long enough for heap fragmentation to
    become a problem. This is of course an issue for all long-running
    programs, not only the ones employing this library. However, it should
    be noted that fragmentation problems could show up earlier than with
    traditional FSM frameworks</li>
</ul>
<p>Should an application not meet these prerequisites, Boost.Statechart's
memory management can be customized as follows:</p>
<ul>
  <li>By passing a model of the standard Allocator concept to the class
    templates that support a corresponding parameter
    (<code>event&lt;&gt;</code>, <code>state_machine&lt;&gt;</code>,
    <code>asynchronous_state_machine&lt;&gt;</code>,
    <code>fifo_scheduler&lt;&gt;</code>, <code>fifo_worker&lt;&gt;</code>).
    This redirects all allocations to the passed custom allocator and
    should satisfy the needs of just about any project</li>
  <li>Additionally, it is possible to separately customize state memory
    management by overloading <code>operator new()</code> and
    <code>operator delete()</code> for all state classes, but this is
    probably only useful under rare circumstances</li>
</ul>
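<p>The second option can be sketched in plain C++. The fixed-size free-list
pool below is my own minimal stand-in for a real pool allocator, not
library code; the point is only that a state class with overloaded
<code>operator new()</code>/<code>operator delete()</code> bypasses the
global heap:</p>

<pre>
#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;new&gt;

// Minimal free-list pool: released slots are reused before the heap is hit.
class StatePool {
public:
  static void* allocate() {
    if (!head_) return ::operator new(kSlot);  // pool empty: fall back once
    void* p = head_;
    head_ = *static_cast&lt;void**&gt;(head_);       // pop the free list
    return p;
  }
  static void release(void* p) {
    *static_cast&lt;void**&gt;(p) = head_;           // push onto the free list
    head_ = p;
  }
private:
  static const std::size_t kSlot = 64;  // big enough for the state below
  static void* head_;
};
void* StatePool::head_ = nullptr;

struct Running {                        // stand-in for a state class
  int counter = 0;
  static void* operator new(std::size_t) { return StatePool::allocate(); }
  static void operator delete(void* p) { StatePool::release(p); }
};

int main() {
  Running* a = new Running;             // heap once, pool afterwards
  void* first = a;
  delete a;                             // slot goes onto the free list
  Running* b = new Running;             // reuses the same slot
  assert(b == first);
  delete b;
  return 0;
}
</pre>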
<h2><a name="RttiCustomization">RTTI customization</a></h2>
<p>RTTI is used for event dispatch and
<code>state_downcast&lt;&gt;()</code>. Currently, there are exactly two
options:</p>
<ol>
  <li>By default, a speed-optimized internal implementation is
    employed</li>
  <li>The library can be instructed to use native C++ RTTI instead by
    defining <a href=
    "configuration.html#ApplicationDefinedMacros">BOOST_STATECHART_USE_NATIVE_RTTI</a></li>
</ol>
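<p>Since <code>BOOST_STATECHART_USE_NATIVE_RTTI</code> is an
application-defined configuration macro, a translation unit choosing option
2 would presumably define it before including any library header, along
these lines (a sketch only, not compiled here; see the linked configuration
page for the authoritative rules):</p>

<pre>
// Must be visible before any Boost.Statechart header is included, and
// defined consistently across all translation units of the state machine.
#define BOOST_STATECHART_USE_NATIVE_RTTI

#include &lt;boost/statechart/state_machine.hpp&gt;
#include &lt;boost/statechart/simple_state.hpp&gt;
</pre>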
<p>There are two reasons to favor option 2:</p>
<ul>
  <li>When a state machine (or parts of it) is compiled into a DLL,
    problems could arise from the use of the internal RTTI mechanism (see
    the FAQ item "How can I compile a state machine into a dynamic link
    library (DLL)?"). Using option 2 is one way to work around such
    problems (on some platforms, it seems to be the only way)</li>
  <li>State and event objects need to store one pointer less, meaning that
    in the best case the memory footprint of a state machine object could
    shrink by 15% (an empty event is typically 30% smaller, which can be an
    advantage when there are bursts of events rather than a steady flow).
    However, on most platforms executable size grows when C++ RTTI is
    turned on. So, given the small per-machine-object savings, this only
    makes sense in applications where both of the following conditions
    hold:
    <ul>
      <li>Event dispatch will never become a bottleneck</li>
      <li>There is a need to reduce the memory allocated at runtime (at the
        cost of a larger executable)</li>
    </ul>
    Obvious candidates are embedded systems where the executable resides in
    ROM. Other candidates are applications running a large number of
    identical state machines, where this measure could even reduce the
    overall memory footprint</li>
</ul>
<h2><a name="ResourceUsage">Resource usage</a></h2>

<h3>Memory</h3>
<p>On a 32-bit box, one empty active state typically needs less than 50
bytes of memory. Even very complex machines will usually have fewer than 20
simultaneously active states, so just about every machine should run with
less than one kilobyte of memory (not counting event queues). Obviously,
the per-machine memory footprint is offset by whatever state-local members
the user adds.</p>
<h3>Processor cycles</h3>
<p>The following ranking should give a rough picture of what feature will
consume how many cycles:</p>
<ol>
  <li><code>state_cast&lt;&gt;()</code>: By far the most cycle-consuming
    feature. Searches linearly for a suitable state, using one
    <code>dynamic_cast</code> per visited state</li>
  <li>State entry and exit: Profiling of the fully optimized
    1-bit-BitMachine suggested that roughly three quarters of the total
    event processing time is spent destructing the exited state and
    constructing the entered state. Obviously, transitions where the <a
    href="definitions.html#InnermostCommonContext">innermost common
    context</a> is "far" from the leaf states and/or with lots of
    orthogonal states can easily cause the destruction and construction of
    quite a few states, leading to significant amounts of time spent for a
    transition</li>
  <li><code>state_downcast&lt;&gt;()</code>: Searches linearly for the
    requested state, using one virtual call and one RTTI comparison per
    visited state</li>
  <li>Deep history: For all innermost states inside a state passing either
    <code>has_deep_history</code> or <code>has_full_history</code> to its
    state base class, a binary search through the (usually small) history
    map must be performed on each exit. History slot allocation is
    performed exactly once, at first exit</li>
  <li>Shallow history: For all direct inner states of a state passing
    either <code>has_shallow_history</code> or
    <code>has_full_history</code> to its state base class, a binary search
    through the (usually small) history map must be performed on each exit.
    History slot allocation is performed exactly once, at first exit</li>
  <li>Event dispatch: One virtual call followed by a linear search for a
    suitable reaction, using one RTTI comparison per visited reaction</li>
  <li>Orthogonal states: One additional virtual call for each exited state,
    if there is more than one active leaf state before a transition. It
    should also be noted that the worst-case event dispatch time is
    multiplied in the presence of orthogonal states. For example, if two
    orthogonal leaf states are added to a given state configuration, the
    worst-case time is tripled</li>
</ol>
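<p>The top-ranked cost, <code>state_cast&lt;&gt;()</code>, can be made
concrete with a small sketch (my own types, not the library's): the search
walks the currently active states and tries one <code>dynamic_cast</code>
per visited state until one succeeds:</p>

<pre>
#include &lt;cassert&gt;

struct StateBase { virtual ~StateBase() {} };
struct Walking : StateBase {};
struct Running : StateBase {};

// Linear search over the active states; each probe is a dynamic_cast,
// which is what makes this the most cycle-consuming feature.
template &lt;typename Target&gt;
const Target* state_cast_sketch(const StateBase* const* first,
                                const StateBase* const* last) {
  for (; first != last; ++first)
    if (const Target* t = dynamic_cast&lt;const Target*&gt;(*first))
      return t;                     // found a state of the requested type
  return nullptr;                   // no active state matches
}

int main() {
  Walking w;
  Running r;
  const StateBase* active[] = { &amp;w, &amp;r };  // currently active states
  assert(state_cast_sketch&lt;Running&gt;(active, active + 2) == &amp;r);
  assert(state_cast_sketch&lt;Walking&gt;(active, active + 2) == &amp;w);
  return 0;
}
</pre>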
<hr>

<p><img border="0" src="../../../doc/images/valid-html401.png" alt=
"Valid HTML 4.01 Transitional" height="31" width="88"></p>
<p>Revised 03 December, 2006</p>

<p>Copyright &copy; 2003-2006 Andreas Huber D&ouml;nni</p>

<p>Distributed under the Boost Software License, Version 1.0. (See
accompanying file LICENSE_1_0.txt or copy at <a href=
"http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</a>)</p>

</body>

</html>