How can experiments be more systematic and comparable?
Participants:
- Stefan Appel
- Arnd Schröter
- Jun Ma
- Nejm Saadallah
- Amir Malekpour
- Anh Tuan Nguyen
- Weixiong Rao
Guiding Questions
- Where can we obtain realistic workloads and data sets?
- What benchmarks exist today and should exist in the future?
- How have other communities developed and adopted benchmarks and data sets?
- What are realistic models for workload generation?
- What are good performance metrics?
- What is a solid evaluation methodology?
Techniques
- Simulators
- Emulators
- Real Deployment
| Technique | Pros | Cons |
|---|---|---|
| Simulation | Debugging | Adaptation of applications |
| | Flexibility | Implementation details overlooked |
| | Large-scale possible | |
| | Reproducibility | |
| | Simple result collection | |
| Emulation | No adaptation of apps necessary | Still not realistic |
| | Medium-scale experiments | Medium-scale experiments |
| Real Deployment | No adaptation of apps necessary | No reproducibility |
| | Realistic | Debugging |
| | | Control of parameters |
| | | Not large-scale |
Measures and Metrics
Black-Box - Client side
- Correctness
- Completeness (No false negatives)
- Accuracy (No false positives)
- Ordering
- End-to-end delay of notifications and subscriptions
- Throughput
- System behavior while overloaded
Gray-Box - Operator view
- Load balancing
- Memory consumption
- Utilization of brokers and links
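As a minimal sketch of how the client-side metrics above could be computed from experiment logs, the following assumes that each notification carries a unique identifier and that publish and receive timestamps come from synchronized clocks. The function names and the log layout are hypothetical and were not part of the session discussion.

```python
from statistics import mean


def completeness(expected: set, delivered: set) -> float:
    """Fraction of expected notifications that arrived (1.0 means no false negatives)."""
    if not expected:
        return 1.0
    return len(expected & delivered) / len(expected)


def accuracy(expected: set, delivered: set) -> float:
    """Fraction of delivered notifications that were expected (1.0 means no false positives)."""
    if not delivered:
        return 1.0
    return len(expected & delivered) / len(delivered)


def mean_delay(publish_times: dict, receive_times: dict) -> float:
    """Mean end-to-end delay over notifications that were both published and received."""
    delays = [
        receive_times[event_id] - publish_times[event_id]
        for event_id in receive_times
        if event_id in publish_times
    ]
    return mean(delays) if delays else float("nan")
```

Reporting completeness and accuracy separately keeps lost notifications distinct from spurious ones, which a single combined "correctness" figure would hide.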
Workload and Datasets
- Distribution of clients
- Mobility of clients
- Publication and subscription characteristics
Generated
- Patterns of events, subscriptions, ...
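A generated workload is only comparable across experiments if it can be regenerated exactly. The sketch below illustrates one way to do this with a seeded random number generator. The Zipf-like topic popularity and the exponentially distributed inter-arrival times are illustrative assumptions, not models agreed on in the session, and `generate_workload` is a hypothetical helper.

```python
import random


def generate_workload(num_events, num_topics, rate, zipf_s=1.0, seed=42):
    """Return a list of (timestamp, topic) publications.

    Topic popularity follows a Zipf-like law with exponent zipf_s;
    inter-arrival times are exponential with the given mean rate
    (events per time unit). Both are assumptions made for illustration.
    """
    rng = random.Random(seed)  # private generator, so the same seed gives the same workload
    topics = list(range(num_topics))
    weights = [1.0 / (rank ** zipf_s) for rank in range(1, num_topics + 1)]

    timestamp = 0.0
    events = []
    for _ in range(num_events):
        timestamp += rng.expovariate(rate)
        topic = rng.choices(topics, weights=weights, k=1)[0]
        events.append((timestamp, topic))
    return events
```

Publishing the seed and the generator parameters together with the results lets other groups replay the same event sequence against their own systems.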
Real data
- from websites
- from companies
Benchmarks
Benchmark: SPECjms2007
- The workload used is described in "Performance evaluation of message-oriented middleware using the SPECjms2007 benchmark" by Kai Sachs, Samuel Kounev, Jean Bacon, and Alejandro Buchmann (Performance Evaluation, Vol. 66, Elsevier Science, 2009).
- SPECjms2007 was also extended to evaluate, for example, the pure pub/sub capabilities of JMS MOM. The extension was presented as a demo at SIGMETRICS 2009 and can be found here.
Other data
Physical networks, e.g. BRITE
