Reasoning about floating-point arithmetic is notoriously hard. While static and dynamic analysis techniques or program repair have made significant progress, more work is still needed to make them relevant to real-world code. On the critical path to that goal is understanding what real-world floating-point code looks like. To close that knowledge gap, this paper presents the first large-scale empirical study of floating-point arithmetic usage in statically typed languages across public GitHub repositories. We follow state-of the art mining practices including random sampling and filtering based on only intrinsic properties to avoid bias, and identify floating-point usage by searching for keywords in the source code, and programming language constructs (e.g., loops) by parsing the code. Our evaluation supports the claim often made in papers that floating-point arithmetic is widely used. Comparing statistics such as size and usage of certain constructs and functions, we find that benchmarks used in literature to evaluate automated reasoning techniques for floating-point arithmetic are in certain aspects representative of 'real-world' code, but not in all. We aim for our study and dataset to help future techniques for floating-point arithmetic to be designed and evaluated to match actual users' expectations.
Simon Guilloud, Sankalp Gambhir, Andrea Gilot, and Viktor Kuncak
Abstract
We present a mechanized embedding of higher-order logic (HOL) and algebraic data types (ADTs) into first-order logic with ZFC axioms. Our approach interprets types as sets, with function (arrow) types coinciding with set-theoretic function spaces. We assume traditional FOL syntax without notation for term-level binders. To embed λ-terms, we define a notion of context, defining the closure of all abstractions occuring inside a term. We implement the embedding in the Lisa proof assistant for schematic first-order logic and its library based on axiomatic set theory (presented at ITP 2023). We show how to implement type checking and the proof steps of HOL Light as proof-producing tactics in Lisa. The embedded HOL theorems and proofs are interoperable with the existing Lisa library. This yields a form of soft type system supporting top-level polymorphism and ADTs within set theory. The approach offers tools for Lisa users to carry HOL-style proofs within set theory. It also enables the import of HOL Light theorem statements into Lisa, as well as the replay of small HOL Light kernel proofs.
Work Experience
Research Intern, LARA (EPFL)July 2024
Supervised by Prof. Viktor Kuncak
Continued my Master thesis, a mechanization of inductive datatypes inside a first order logic theorem prover equipped with the axioms of set theory.
Research Intern, Digital AssetMarch-August 2023
Supervised by Dr. Andreas Lochbihler
Proof formalization and mechanization of an optimization of a decentralized protocol adopted by financial institutions including Deutsche Börse, HKMA, Goldman Sachs, and BNP Paribas.
Developpement of a graphical interface in Java and a client-server architecture in C to drive a data acquisition system.
Teaching
Thesis supervision, Uppsala University
Master theses:
Mario Tannous - Specification Inference for Scala and Stainless
Bachelor theses:
Carl Martinsson - Code Maintainability and Data Structures in Energy-Aware Software: An Empirical GitHub Study
(co-supervised with
Jennifer Gross)
Samuel Sverker - Energy-Aware Software Development in Operating Systems: A Thematic Analysis of OS-Level Optimization Techniques
(co-supervised with
Jennifer Gross)
Valentin Hassinen - Field Testing Stainless by Verifying Daisy's AST Library
Teaching Assistant, Uppsala University
1DL201 Program Design and Data Structures (2024, 2025)
1DL042 Programming (2025)
Student Assistant, EPFL
CS-550 Formal Verification (2022, 2023)
CS-251 Theory of Computation (2022)
CS-233 Introduction to Machine Learning (2021, 2022)