In probabilistic programming languages, model construction and inference are separate tasks. The former is done by the language users and the latter by the language developers. Unfettered by concerns about inference, modelers will want to create big, complex probabilistic programs, which will present familiar problems: Models may become too big to comprehend, debug or validate as wholes. Users building large applications will need to divide and coordinate work among several people and will need to reuse common elements of models without rebuilding them. We hypothesize that model creation will benefit from two linguistic tools that have proven helpful in deterministic programming: types and modules.
With Insomnia, we hope to provide the best of two worlds: to help with the “democratization of machine learning,” we provide a language with abstractions for modular development of complex probabilistic programs. To help with analysis and understanding of models’ properties, we define the meaning of modular models by elaborating them into a small (module-free) core calculus whose semantics may be studied further.
Extended Abstract (PDF)
Insomnia (prototype and examples)
Aleksey Kliger and Sean Stromsten
We define a core calculus for the purpose of investigating reasoning principles of probabilistic programming languages. Using a variation of higher-order abstract syntax (HOAS), a technique common in the implementation of domain-specific languages, the calculus captures the semantics of a stochastic language with observation while remaining agnostic to the details of its deterministic portions. This agnosticism enables the discovery of general reasoning principles for the principled manipulation of probabilistic program fragments by programmers, compilers, and analysis tools. In particular, it allows us to reason about probabilistic program fragments without resorting to the underlying measure theory in every instance, by instead reasoning in terms of the core calculus in a way that we believe applies to a variety of surface languages.
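To make the HOAS idea concrete, here is a minimal sketch in Python (our own illustration, not the authors' calculus): binders in the object language are represented by host-language functions, so only the stochastic constructs (`Sample`, `Observe`) appear as explicit syntax, and the deterministic portions of a program are ordinary host code.

```python
import random

class Ret:
    """Deterministic result of a program."""
    def __init__(self, value):
        self.value = value

class Sample:
    """Draw from a primitive distribution, then continue. The continuation
    `k` is a host-language function — this is the HOAS trick: the object
    language never names its bound variables."""
    def __init__(self, dist, k):
        self.dist, self.k = dist, k

class Observe:
    """Condition on a boolean predicate, then continue."""
    def __init__(self, pred, k):
        self.pred, self.k = pred, k

def run(term, rng):
    """One run of a program by rejection: rejected runs return None.
    Note the interpreter only inspects the stochastic constructs."""
    while True:
        if isinstance(term, Ret):
            return term.value
        elif isinstance(term, Sample):
            term = term.k(term.dist(rng))      # apply host-level continuation
        elif isinstance(term, Observe):
            if not term.pred:
                return None                    # observation failed: reject
            term = term.k

# Example program: x ~ Bernoulli(0.5); observe x; return x
model = Sample(lambda rng: rng.random() < 0.5,
               lambda x: Observe(x, Ret(x)))
```

Because deterministic computation lives entirely inside the host-language continuations, the interpreter (and any reasoning about it) needs no commitment to what the deterministic sublanguage looks like.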
Extended abstract (PDF)
Theo Giannakopoulos (BAE Systems) and Mitchell Wand & Andrew Cobb (Northeastern University)
The Semantics of Figaro, an Embedded Probabilistic Programming Language
The goal of this paper is to show how semantics can be defined for a real-world probabilistic programming language. Figaro has had numerous applications in areas such as malware analysis, climate modeling, vehicle health maintenance, and intelligence analysis.
Talk slides on Figaro semantics
Avi Pfeffer and Brian Ruttenberg, Charles River Analytics
Authors: Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, Marcin Szymczak
We develop the operational semantics of a probabilistic lambda-calculus with continuous distributions, as a foundation for universal probabilistic programming languages such as Church, Anglican, and Venture. Our first contribution is to adapt the classic operational semantics of lambda-calculus to the continuous case, by creating a measure space on terms and defining step-indexed approximations. We prove equivalence of big-step and small-step formulations of this distributional semantics. Our second contribution is to formalize the implementation technique of trace MCMC for our calculus and to show correctness. A key step is defining a sampling semantics of a term as a function from a trace of random samples to a value, and showing that the distribution induced by integrating over all traces equals the distributional semantics. Another step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distributional semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language.
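The two key ideas — a program as a function from a trace of random samples to a value, and trace MCMC as a Markov chain over such traces — can be sketched in a few lines of Python. This is our own simplified illustration (the model, proposal, and densities are assumptions for the example), not the paper's calculus or its proof conditions.

```python
import math
import random

def model(trace):
    """Sampling semantics of an example program:
    x ~ Uniform(0,1); observe a Normal(0.7, 0.1) density at x.
    The trace supplies all random draws; the program is a deterministic
    function from trace to (value, log_weight)."""
    x = trace[0]                                  # consume one random sample
    log_w = -0.5 * ((x - 0.7) / 0.1) ** 2         # unnormalized observation density
    return x, log_w

def mh_step(trace, log_w, rng):
    """One trace-MCMC step: resample a single entry of the trace and
    accept or reject with the Metropolis-Hastings ratio on weights."""
    i = rng.randrange(len(trace))
    proposal = list(trace)
    proposal[i] = rng.random()                    # uniform proposal for that draw
    _, log_w_new = model(proposal)
    if math.log(rng.random() + 1e-300) < log_w_new - log_w:
        return proposal, log_w_new                # accept
    return trace, log_w                           # reject, keep old trace

rng = random.Random(0)
trace = [rng.random()]
_, log_w = model(trace)
for _ in range(2000):
    trace, log_w = mh_step(trace, log_w, rng)
# after many steps, trace[0] concentrates near the observed mode 0.7
```

Integrating the value returned by `model` over all traces, weighted by `exp(log_w)`, corresponds to the distributional semantics; the correctness question the paper addresses is when the chain above converges to exactly that distribution.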
Welcome to PPS, workshop on probabilistic programming semantics, on Saturday, 23 January 2016, colocated with POPL. This informal workshop aims to bring programming-language and machine-learning researchers together to advance the semantic foundations of probabilistic programming.
As listed on the posted schedule, we have accepted 20 extended abstracts submitted by a wide range of contributors. We accepted 10 submissions as posters and 10 as talks, not on the basis of reviewer scores but according to which medium we thought would convey the material most effectively. As a result, some highly ranked submissions of a more technical nature were accepted as posters.
To foster collaboration and establish common ground, we asked all accepted contributors to post their revised extended abstracts on this site, along with any other materials such as preprints they want to share.
Everyone is welcome to post comments, questions, and other discussion on the posts. Because probabilistic programming is a research area that bridges multiple communities with different vocabularies, comments of the flavor “I don’t understand what you mean by X” are particularly valuable!