Publications about Benchmarking
Articles in journal or book chapters
-
Reliable Benchmarking: Requirements and Solutions.
International Journal on Software Tools for Technology Transfer (STTT), 21(1):1-29,
2019.
doi:10.1007/s10009-017-0469-y
Keyword(s): Benchmarking
Publisher's Version
PDF
Presentation
Supplement
Abstract
Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of which is essential for researchers, tool developers, and users, as well as for tool competitions. We identify a set of requirements that are indispensable for reliable benchmarking and resource measurement of time and memory usage of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework can (on Linux systems) currently only be done by using the cgroup and namespace features of the kernel. We developed BenchExec, a ready-to-use, tool-independent, and open-source implementation of a benchmarking framework that fulfills all presented requirements, making reliable benchmarking and resource measurement easy. Our framework is able to work with a wide range of different tools, has proven its reliability and usefulness in the International Competition on Software Verification, and is used by several research groups worldwide to ensure reliable benchmarking. Finally, we present guidelines on how to present measurement results in a scientifically valid and comprehensible way.BibTeX Entry
@article{Benchmarking-STTT, author = {Dirk Beyer and Stefan L{\"o}we and Philipp Wendler}, title = {Reliable Benchmarking: {R}equirements and Solutions}, journal = {International Journal on Software Tools for Technology Transfer (STTT)}, volume = {21}, number = {1}, pages = {1--29}, year = {2019}, doi = {10.1007/s10009-017-0469-y}, sha256 = {a50fbc212af394b32166d6354f986e7b1d5bc87220bdc50df899d6a46fedf33c}, url = {https://www.sosy-lab.org/research/benchmarking/}, presentation = {https://www.sosy-lab.org/research/prs/Latest_ReliableBenchmarking.pdf}, abstract = {Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of which is essential for researchers, tool developers, and users, as well as for tool competitions. We identify a set of requirements that are indispensable for reliable benchmarking and resource measurement of time and memory usage of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework can (on Linux systems) currently only be done by using the cgroup and namespace features of the kernel. We developed BenchExec, a ready-to-use, tool-independent, and open-source implementation of a benchmarking framework that fulfills all presented requirements, making reliable benchmarking and resource measurement easy. Our framework is able to work with a wide range of different tools, has proven its reliability and usefulness in the International Competition on Software Verification, and is used by several research groups worldwide to ensure reliable benchmarking. Finally, we present guidelines on how to present measurement results in a scientifically valid and comprehensible way.}, keyword = {Benchmarking}, _pdf = {https://www.sosy-lab.org/research/pub/2019-STTT.Reliable_Benchmarking_Requirements_and_Solutions.pdf}, annote = {Publication appeared first online in November 2017<BR/> BenchExec is available at: <a href="https://github.com/sosy-lab/benchexec"> https://github.com/sosy-lab/benchexec</a>}, }Additional Infos
Publication appeared first online in November 2017
BenchExec is available at: https://github.com/sosy-lab/benchexec -
Beiträge zu praktikabler Prädikatenanalyse.
In S. Hölldobler, editors,
Ausgezeichnete Informatikdissertationen 2017,
LNI,
pages 261-270,
2018.
Gesellschaft für Informatik (GI).
Keyword(s): Benchmarking, CPAchecker, Software Model Checking Publisher's Version
PDF
Presentation
Supplement
Abstract
Der Stand der Forschung im Bereich der automatischen Software-Verifikation ist fragmentiert. Verschiedene Verfahren existieren nebeneinander in unterschiedlichen Darstellungen und mit wenig Bezug zueinander, aussagekräftige Vergleiche sind selten. Die Dissertation adressiert dieses Problem. Ein konfigurierbares und flexibles Rahmenwerk zur Vereinheitlichung solcher Verfahren wird entwickelt und mehrere vorhandene Verfahren werden in diesem Rahmenwerk ausgedrückt. Dies bringt neue Erkenntnisse über die Kernideen dieser Verfahren, ermöglicht experimentelle Studien in einer neuartigen Qualität, und erleichtert die Forschung an Kombinationen und Weiterentwicklungen dieser Verfahren. Die Implementierung dieses Rahmenwerks im erfolgreichen Verifizierer CPAchecker wird in der bisher größten derartigen experimentellen Studie (120 verschiedene Konfigurationen, 671280 Ausführungen) evaluiert. Hierzu wird ein Benchmarking-System präsentiert, das mit Hilfe moderner Technologien signifikante qualitative Messfehler existierender Systeme vermeidet.BibTeX Entry
@incollection{DissZusammenfassungWendler, author = {Philipp Wendler}, title = {Beitr{\"{a}}ge zu praktikabler Pr{\"{a}}dikatenanalyse}, booktitle = {Ausgezeichnete Informatikdissertationen 2017}, editor = {S. H{\"{o}}lldobler}, volume = {{D-18}}, pages = {261-270}, year = {2018}, series = {{LNI}}, publisher = {Gesellschaft f{\"{u}}r Informatik ({GI})}, isbn = {978-3885799771}, url = {https://www.sosy-lab.org/research/phd/wendler/}, pdf = {https://dl.gi.de/bitstream/handle/20.500.12116/19476/invited_paper_4.pdf?sequence=1&isAllowed=y}, presentation = {https://www.sosy-lab.org/research/prs/2018-05-08_GiDiss_BeitraegeZuPraktikablerPraedikatenanalyse.pdf}, abstract = {Der Stand der Forschung im Bereich der automatischen Software-Verifikation ist fragmentiert. Verschiedene Verfahren existieren nebeneinander in unterschiedlichen Darstellungen und mit wenig Bezug zueinander, aussagekräftige Vergleiche sind selten. Die Dissertation adressiert dieses Problem. Ein konfigurierbares und flexibles Rahmenwerk zur Vereinheitlichung solcher Verfahren wird entwickelt und mehrere vorhandene Verfahren werden in diesem Rahmenwerk ausgedrückt. Dies bringt neue Erkenntnisse über die Kernideen dieser Verfahren, ermöglicht experimentelle Studien in einer neuartigen Qualität, und erleichtert die Forschung an Kombinationen und Weiterentwicklungen dieser Verfahren. Die Implementierung dieses Rahmenwerks im erfolgreichen Verifizierer CPAchecker wird in der bisher größten derartigen experimentellen Studie (120 verschiedene Konfigurationen, 671280 Ausführungen) evaluiert. Hierzu wird ein Benchmarking-System präsentiert, das mit Hilfe moderner Technologien signifikante qualitative Messfehler existierender Systeme vermeidet.}, keyword = {Benchmarking,CPAchecker,Software Model Checking}, annote = {This is a German summary of the dissertation <a href="https://www.sosy-lab.org/research/bib/Year/2017.complete.html#PhilippPredicateAnalysis">Towards Practical Predicate Analysis</a>.}, doifalse = {20.500.12116/19476}, urlpub = {https://dl.gi.de/handle/20.500.12116/19476}, }Additional Infos
This is a German summary of the dissertation Towards Practical Predicate Analysis.
Articles in conference or workshop proceedings
-
CPU Energy Meter: A Tool for Energy-Aware Algorithms Engineering.
In Proceedings of the 26th International Conference on
Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2020, Dublin, Ireland, April 25-30), part 2,
LNCS 12079,
pages 126-133,
2020.
Springer.
doi:10.1007/978-3-030-45237-7_8
Keyword(s): Benchmarking
Publisher's Version
PDF
Presentation
Video
Supplement
Abstract
Verification algorithms are among the most resource-intensive computation tasks. Saving energy is important for our living environment and to save cost in data centers. Yet, researchers compare the efficiency of algorithms still in terms of consumption of CPU time (or even wall time). Perhaps one reason for this is that measuring energy consumption of computational processes is not as convenient as measuring the consumed time and there is no sufficient tool support. To close this gap, we contribute CPU Energy Meter, a small tool that takes care of reading the energy values that Intel CPUs track inside the chip. In order to make energy measurements as easy as possible, we integrated CPU Energy Meter into BenchExec, a benchmarking tool that is already used by many researchers and competitions in the domain of formal methods. As evidence for usefulness, we explored the energy consumption of some state-of-the-art verifiers and report some interesting insights, for example, that energy consumption is not necessarily correlated with CPU time.BibTeX Entry
@inproceedings{TACAS20b, author = {Dirk Beyer and Philipp Wendler}, title = {CPU Energy Meter: A Tool for Energy-Aware Algorithms Engineering}, booktitle = {Proceedings of the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS~2020, Dublin, Ireland, April 25-30), part 2}, pages = {126-133}, year = {2020}, series = {LNCS~12079}, publisher = {Springer}, doi = {10.1007/978-3-030-45237-7_8}, sha256 = {c5c8ad06f4b192e61799469a8fc6ca4661714aa2945e0ce07363a376ff06dcd7}, url = {https://www.sosy-lab.org/research/energy-measurement/}, presentation = {https://www.sosy-lab.org/research/prs/2021-03-31_TACAS20_CPU-Energy-Meter_Dirk.pdf}, abstract = {Verification algorithms are among the most resource-intensive computation tasks. Saving energy is important for our living environment and to save cost in data centers. Yet, researchers compare the efficiency of algorithms still in terms of consumption of CPU time (or even wall time). Perhaps one reason for this is that measuring energy consumption of computational processes is not as convenient as measuring the consumed time and there is no sufficient tool support. To close this gap, we contribute CPU Energy Meter, a small tool that takes care of reading the energy values that Intel CPUs track inside the chip. In order to make energy measurements as easy as possible, we integrated CPU Energy Meter into BenchExec, a benchmarking tool that is already used by many researchers and competitions in the domain of formal methods. As evidence for usefulness, we explored the energy consumption of some state-of-the-art verifiers and report some interesting insights, for example, that energy consumption is not necessarily correlated with CPU time.}, keyword = {Benchmarking}, video = {https://youtu.be/qzKAoBVTw2c}, } -
Benchmarking and Resource Measurement.
In B. Fischer and
J. Geldenhuys, editors,
Proceedings of the 22nd International Symposium on
Model Checking of Software (SPIN 2015, Stellenbosch, South Africa, August 24-26),
LNCS 9232,
pages 160-178,
2015.
Springer-Verlag, Heidelberg.
doi:10.1007/978-3-319-23404-5_12
Keyword(s): Benchmarking
Publisher's Version
PDF
Supplement
Abstract
Proper benchmarking and resource measurement is an important topic, because benchmarking is a widely-used method for the comparative evaluation of tools and algorithms in many research areas. It is essential for researchers, tool developers, and users, as well as for competitions. We formulate a set of requirements that are indispensable for reproducible benchmarking and reliable resource measurement of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework is complex and can (on Linux) currently only be done by using the cgroups feature of the kernel. We provide BenchExec, a ready-to-use, tool-independent, and free implementation of a benchmarking framework that fulfills all presented requirements, making reproducible benchmarking and reliable resource measurement easy. Our framework is able to work with a wide range of different tools and has proven its reliability and usefulness in the International Competition on Software Verification.BibTeX Entry
@inproceedings{SPIN15a, author = {Dirk Beyer and Stefan L{\"o}we and Philipp Wendler}, title = {Benchmarking and Resource Measurement}, booktitle = {Proceedings of the 22nd International Symposium on Model Checking of Software (SPIN~2015, Stellenbosch, South Africa, August 24-26)}, editor = {B.~Fischer and J.~Geldenhuys}, pages = {160-178}, year = {2015}, series = {LNCS~9232}, publisher = {Springer-Verlag, Heidelberg}, isbn = {978-3-319-23403-8}, doi = {10.1007/978-3-319-23404-5_12}, url = {https://www.sosy-lab.org/research/benchmarking/}, pdf = {https://www.sosy-lab.org/research/pub/2015-SPIN.Benchmarking_and_Resource_Measurement.pdf}, abstract = {Proper benchmarking and resource measurement is an important topic, because benchmarking is a widely-used method for the comparative evaluation of tools and algorithms in many research areas. It is essential for researchers, tool developers, and users, as well as for competitions. We formulate a set of requirements that are indispensable for reproducible benchmarking and reliable resource measurement of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework is complex and can (on Linux) currently only be done by using the cgroups feature of the kernel. We provide BenchExec, a ready-to-use, tool-independent, and free implementation of a benchmarking framework that fulfills all presented requirements, making reproducible benchmarking and reliable resource measurement easy. Our framework is able to work with a wide range of different tools and has proven its reliability and usefulness in the International Competition on Software Verification.}, keyword = {Benchmarking}, annote = {An <a href="https://www.sosy-lab.org/research/bib/Year/2017.complete.html#Benchmarking-STTT">extended version</a> of this article appeared in STTT.}, }Additional Infos
An extended version of this article appeared in STTT.
Theses and projects (PhD, MSc, BSc, Project)
-
Developing a Verifier Based on Parallel Portfolio with .
Bachelor's Thesis, LMU Munich, Software Systems Lab,
2022.
Keyword(s): Benchmarking
PDF
Presentation
BibTeX Entry
@misc{KleinertParPortfolioCoVeriTeam, author = {Tobias Kleinert}, title = {Developing a Verifier Based on Parallel Portfolio with \textsc{CoVeriTeam}}, year = {2022}, pdf = {https://www.sosy-lab.org/research/bsc/2022.Kleinert.Parallel_Portfolio_CoVeriTeam.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2022-03-16_BA_Parallel_Portfolio_CoVeriTeam.pdf}, keyword = {Benchmarking}, field = {Computer Science}, howpublished = {Bachelor's Thesis, LMU Munich, Software Systems Lab}, } -
Cgroups v2 Support for .
Bachelor's Thesis, LMU Munich, Software Systems Lab,
2022.
Keyword(s): Benchmarking
PDF
Presentation
BibTeX Entry
@misc{GlosterCgroupsV2, author = {Robin Gloster}, title = {Cgroups v2 Support for \textsc{BenchExec}}, year = {2022}, pdf = {https://www.sosy-lab.org/research/bsc/2022.Gloster.Cgroups_v2_Support_for_BenchExec.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2022-03-09_BA_Cgroups_v2_Support_for_BenchExec_Gloster.pdf}, keyword = {Benchmarking}, field = {Computer Science}, howpublished = {Bachelor's Thesis, LMU Munich, Software Systems Lab}, } -
Shareable Benchmarking Reports with Enhanced Filters and Dynamic Statistics for .
Bachelor's Thesis, LMU Munich, Software Systems Lab,
2021.
Keyword(s): Benchmarking
PDF
Presentation
BibTeX Entry
@misc{SimonBA, author = {Dennis Simon}, title = {Shareable Benchmarking Reports with Enhanced Filters and Dynamic Statistics for \textsc{BenchExec}}, year = {2021}, pdf = {https://www.sosy-lab.org/research/bsc/2021.Simon.Shareable_Benchmarking_Reports_with_Enhanced_Filters_and_Dynamic_Statistics_for_BenchExec.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2021-04-28_BA_ShareableBenchmarkingReportsWithEnhancedFiltersAndDynamicStatisticsForBenchExec_Simon.pdf}, keyword = {Benchmarking}, field = {Computer Science}, howpublished = {Bachelor's Thesis, LMU Munich, Software Systems Lab}, } -
Energy Consumption Prediction of Verification Work.
Bachelor's Thesis, LMU Munich, Software Systems Lab,
2020.
Keyword(s): CPAchecker, Benchmarking, Energy Measurement
BibTeX Entry
@misc{IsaakidisEnergy, author = {Petros Isaakidis}, title = {Energy Consumption Prediction of Verification Work}, year = {2020}, keyword = {CPAchecker, Benchmarking, Energy Measurement}, field = {Computer Science}, howpublished = {Bachelor's Thesis, LMU Munich, Software Systems Lab}, } -
Measuring and Optimizing Energy Consumption of Verification Work on Clusters.
Bachelor's Thesis, LMU Munich, Software Systems Lab,
2019.
Keyword(s): Benchmarking, Energy Measurement
PDF
Presentation
BibTeX Entry
@misc{HailerEnergy, author = {Maximilian Hailer}, title = {Measuring and Optimizing Energy Consumption of Verification Work on Clusters}, year = {2019}, pdf = {https://www.sosy-lab.org/research/bsc/2019.Hailer.Measuring_and_Optimizing_Energy_Consumption_of_Verification_Work_on_Clusters.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2019-10-30_BA_MeasuringAndOptimizingEnergyConsumptionOfVerificationWork_Hailer.pdf}, keyword = {Benchmarking, Energy Measurement}, field = {Computer Science}, howpublished = {Bachelor's Thesis, LMU Munich, Software Systems Lab}, } -
Modern Architecture and Improved UI for Tables of BenchExec.
Bachelor's Thesis, LMU Munich, Software Systems Lab,
2019.
Keyword(s): Benchmarking
PDF
Presentation
BibTeX Entry
@misc{BschorTables, author = {Laura Bschor}, title = {Modern Architecture and Improved {UI} for Tables of {{\sc BenchExec}}}, year = {2019}, pdf = {https://www.sosy-lab.org/research/bsc/2019.Bschor.Modern_Architecture_and_Improved_UI_for_Tables_of_BenchExec.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2019-11-06_BA_ModernArchitectureAndImprovedUIforTablesOfBenchExec_Bschor.pdf}, keyword = {Benchmarking}, field = {Computer Science}, howpublished = {Bachelor's Thesis, LMU Munich, Software Systems Lab}, } -
Towards Practical Predicate Analysis.
PhD Thesis, University of Passau, Software Systems Lab,
2017.
Keyword(s): Benchmarking, CPAchecker, Software Model Checking Publisher's Version
PDF
Presentation
Supplement
BibTeX Entry
@misc{PhilippPredicateAnalysis, author = {Philipp Wendler}, title = {Towards Practical Predicate Analysis}, year = {2017}, url = {https://www.sosy-lab.org/research/phd/wendler/}, pdf = {https://www.sosy-lab.org/research/phd/2017.Wendler.Towards_Practical_Predicate_Analysis.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2017-11-20_RigorosumWendler_TowardsPracticalPredicateAnalysis.pdf}, keyword = {Benchmarking,CPAchecker,Software Model Checking}, annote = {Nominated for the <a href="https://gi.de/aktuelles/wettbewerbe/dissertationspreis/">Dissertation award 2017</a> of the German <a href="https://gi.de/">Gesellschaft für Informatik (GI)</a>}, howpublished = {PhD Thesis, University of Passau, Software Systems Lab}, urn = {urn:nbn:de:bvb:739-opus4-5098}, }Additional Infos
Nominated for the Dissertation award 2017 of the German Gesellschaft für Informatik (GI) -
Measuring, Visualizing, and Optimizing the Energy Consumption of Computer Clusters.
Bachelor's Thesis, University of Passau, Software Systems Lab,
2017.
Keyword(s): Benchmarking
PDF
Supplement
BibTeX Entry
@misc{SteingerMeasuring, author = {Nils Steinger}, title = {Measuring, Visualizing, and Optimizing the Energy Consumption of Computer Clusters}, year = {2017}, url = {https://www.sosy-lab.org/research/bsc/steinger}, pdf = {https://www.sosy-lab.org/research/bsc/2017.Steinger.Measuring,_Visualizing,_and_Optimizing_the_Energy_Consumption_of_Computer_Clusters.pdf}, keyword = {Benchmarking}, field = {Computer Science}, howpublished = {Bachelor's Thesis, University of Passau, Software Systems Lab}, }
Disclaimer:
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All person copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.