The seminar takes place on Mondays at 11 a.m. in room 316 of the Métrologie B building. Below is the schedule of the Probability and Statistics seminar for the current academic year.
Contacts: yacouba.boubacar_mainassara univ-fcomte.fr or romain.biard univ-fcomte.fr.
Past talks:
June 16: Thibault Modeste
(Université Lyon Claude Bernard)
Scoring and validation of dynamic probability forecast
Abstract:
Forecasting and its evaluation are major tasks in statistics. In real applications, forecasts often take the form of a dynamic process evolving over time, and this sequential point of view must be taken into account. We propose and discuss a minimal framework for dynamic probability forecasts and their evaluation. Proper scoring rules are a crucial concept for probability forecast evaluation, and we show, under minimal assumptions, that they can still be used in the dynamic framework because they are minimized, in the sense of the long-term average score, by the ideal forecast. Another strategy for forecast evaluation is calibration theory based on the probability integral transform. Here the ideal forecast is characterized by conditional calibration, and we present some new tests based on regression trees, which we compare to the ones proposed by Straehl and Ziegel (EJS 2017) in the framework of cross-calibration.
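As a rough illustration of why a proper scoring rule favours the ideal forecast on long-run average, here is a minimal sketch using the Brier score for a binary event. The sequence `true_probs` and the competing forecasts are hypothetical, and the averages are computed in expectation rather than simulated; this is not the framework of the talk, only a toy instance of the idea.

```python
def expected_brier(q, p):
    """Expected Brier score (q - Y)^2 when Y ~ Bernoulli(p) and the forecast is q."""
    return p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2


def long_run_average(forecasts, true_probs):
    """Average expected score along a sequence of (forecast, truth) pairs."""
    scores = [expected_brier(q, p) for q, p in zip(forecasts, true_probs)]
    return sum(scores) / len(scores)


# Hypothetical time-varying event probabilities; forecasting them exactly
# plays the role of the "ideal forecast".
true_probs = [0.1, 0.5, 0.9, 0.3]

ideal = long_run_average(true_probs, true_probs)
constant = long_run_average([0.45] * len(true_probs), true_probs)
shifted = long_run_average([min(1.0, p + 0.1) for p in true_probs], true_probs)

# The ideal forecast attains the smallest average expected score.
print(ideal, constant, shifted)
```

The expected Brier score equals p(1 - p) + (q - p)^2, so any deviation of the forecast q from the true probability p is penalized by its squared size.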
May 17: François Gaston Ged
(École polytechnique fédérale de Lausanne)
Introduction to the Moran model with selection.
Abstract:
I will present some classical models of population genetics, namely the Moran model (with fixed population size), the Kingman coalescent describing its genealogy, and the Wright-Fisher diffusion, obtained as the scaling limit of the Moran model when the population size tends to infinity. We will see how this limit is affected when a weak selection mechanism is built into the two-allele Moran model. We will compare these results with the recent ones of Schweinsberg [1,2], who studies a Moran model with strong selection. In that case, the genealogy is described by the Bolthausen-Sznitman coalescent. I will then present a refinement of the model studied by Schweinsberg that additionally includes a weak selection mechanism, and I will describe the scaling limit of this process.
Although the talk will be mainly introductory, I will keep some time for technical questions.
[1] Rigorous results for a population model with selection I: evolution of the fitness distribution. Electron. J. Probab. 22 (2017), no. 37, 1-94.
[2] Rigorous results for a population model with selection II: genealogy of the population. Electron. J. Probab. 22 (2017), no. 38, 1-54.
May 10: Paul Thévenin
(Uppsala University)
Random trees, laminations and fragmentations.
Abstract:
Fragmentation processes describe the evolution of an object that splits over time into objects of smaller mass. In the case of a tree, this consists of erasing its edges one after another and considering the sizes of the resulting connected components.
I will present recent results on the asymptotic behaviour of this process when the tree and the order in which its edges are erased are random.
Using a coding of this process by a set of chords of the unit disk, I will also explain a link between the fragmentation of a tree and a model of permutations.
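The edge-removal fragmentation of a tree described in this abstract can be sketched in a few lines. The choice of a random recursive tree is an assumption made only for illustration (the abstract does not say which random tree model is used), and the function names are hypothetical.

```python
import random


def component_sizes(n, edges):
    """Sizes of the connected components of the graph ({0..n-1}, edges), largest first."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


def random_recursive_tree(n, rng):
    """Assumed tree model: vertex v attaches to a uniform earlier vertex."""
    return [(rng.randrange(v), v) for v in range(1, n)]


def fragmentation(n, edges, rng):
    """Erase the edges in uniform random order; record component sizes after each removal."""
    order = list(edges)
    rng.shuffle(order)
    # After k removals, only the edges order[k:] remain.
    return [component_sizes(n, order[k:]) for k in range(len(order) + 1)]


rng = random.Random(3)
tree = random_recursive_tree(12, rng)
for sizes in fragmentation(12, tree, rng):
    print(sizes)
```

Each removal splits exactly one component in two, so the k-th recorded state has k + 1 components whose sizes sum to the number of vertices.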
May 3: Chifaa Dahik
(FEMTO, Univ. Bourgogne Franche-Comté)
Robust Shortest Path Problem in presence of uncertainty
Abstract:
We address a specific class of combinatorial problems with correlated cost coefficients belonging to an ellipsoidal uncertainty set. The absolute robust problem in this setting is well known to be NP-hard. To tackle it, we propose a heuristic approach based on the Frank-Wolfe (FW) algorithm. In our approach, we take a radically different perspective on FW by looking at the exploration power of the integer inner iterates of the method. Experimental tests have been carried out for the robust shortest path problem, together with comparisons with the optimal solution given by the mixed-integer second-order cone programming solver of CPLEX. Our findings are illustrated by comprehensive numerical experiments.
April 26: Benedetta Cavalli
(University of Zurich)
A probabilistic view on the long-time behaviour of growth-fragmentation equations with bounded fragmentation rates
Abstract:
The growth-fragmentation equation models systems of particles that grow and reproduce as time passes. An important question concerns the asymptotic behaviour of its solutions. This question has traditionally been addressed using analytic techniques such as entropy methods or splitting of operators.
Bertoin and Watson (2018) developed a probabilistic approach relying on the Feynman-Kac formula, which enabled them to answer this question in the case where the growth rate is sublinear and the mass is conserved at fragmentation events. This assumption on the growth ensures that microscopic particles remain microscopic.
In this talk, we present a recent work of the speaker, in which we go further in the analysis, assuming bounded fragmentations and allowing arbitrarily small particles to reach macroscopic mass in finite time. Moreover, we drop the hypothesis of conservation of mass when a fragmentation occurs. With the Feynman-Kac approach, we establish necessary and sufficient conditions on the coefficients of the equation that ensure the so-called Malthusian behaviour with exponential speed of convergence to the asymptotic profile. Furthermore, we provide an explicit expression of the latter.
March 29: Verónica Miró Pina
(Centre for Genomic Regulation, Barcelona)
The Wright-Fisher model with efficiency.
Abstract:
We consider a population with limited resources and two types of individuals: the "inefficient" ones, which need a large amount of resources to survive and reproduce, and the "efficient" ones, which can survive with fewer resources. We show that the "inefficient" strategy facilitates the fixation of advantageous mutations, and hence that this strategy can be favoured.
To this end, we consider the process given by the frequency of inefficient individuals in the population and show that it converges to a diffusion that generalizes the classical Wright-Fisher diffusion.
We are also interested in the genealogy of such a population, which can be modelled not by a tree but by a graph whose law we characterize.
March 22: Dan Daniel Erdmann-Pham
(University of California, Berkeley)
Hydrodynamics of the inhomogeneous l-TASEP and its Application to Protein Synthesis
Abstract:
The inhomogeneous l-TASEP is an interacting particle process wherein particles stochastically enter, unidirectionally traverse, and finally exit a one-dimensional lattice segment at rates that may depend on a particle’s location within the lattice. Its homogeneous version is known to exhibit various phase transitions in macroscopic observables like particle density and current, with fluctuations governed by what is known as the KPZ equation. In this talk, we begin to extend such results to the inhomogeneous setting by developing the so-called hydrodynamic limit, which governs the system dynamics on an LLN-type scale. If time permits, we apply our results to elucidate the key determinants of protein synthesis, which motivated the introduction of TASEP fifty years ago. This is based on joint work with Khanh Dao Duc and Yun S. Song.
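To make the dynamics concrete, here is a minimal simulation sketch of an open-boundary TASEP with site-dependent hopping probabilities. It makes several simplifying assumptions that are not in the abstract: particles occupy a single site (l = 1), time is discretized by random-sequential updates, and the entry, hopping and exit rates (`alpha`, `rates`, `beta`, all hypothetical names) are rescaled to probabilities in [0, 1].

```python
import random


def simulate_tasep(rates, alpha, beta, steps, rng):
    """Random-sequential update of an open TASEP with site-dependent hop probabilities.

    rates[i] is the probability that a particle at site i hops to site i + 1
    when selected (the value at the last site is unused: exit uses beta).
    Returns the final lattice and the numbers of entered and exited particles.
    """
    n = len(rates)
    lattice = [0] * n
    entered = exited = 0
    for _ in range(steps):
        i = rng.randrange(-1, n)  # -1 selects the entry move, 0..n-1 a lattice site
        if i == -1:
            if lattice[0] == 0 and rng.random() < alpha:
                lattice[0] = 1
                entered += 1
        elif lattice[i] == 1:
            if i == n - 1:
                if rng.random() < beta:
                    lattice[i] = 0
                    exited += 1
            elif lattice[i + 1] == 0 and rng.random() < rates[i]:
                lattice[i], lattice[i + 1] = 0, 1  # hop to the right, exclusion respected
    return lattice, entered, exited


# Hypothetical inhomogeneous profile with a slow region in the middle.
rates = [0.9] * 10 + [0.3] * 5 + [0.9] * 10
lattice, entered, exited = simulate_tasep(rates, 0.6, 0.6, 200_000, random.Random(0))
print(sum(lattice) / len(lattice), exited)
```

Averaging the occupation of each site over many updates would give an empirical density profile, which is the kind of macroscopic observable a hydrodynamic limit describes.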
March 1: Landy Rabehasaina
(LmB, Univ. Bourgogne Franche-Comté)
Multitype branching process with nonhomogeneous Poisson and contagious Poisson immigration
Abstract:
In a multitype branching process, it is assumed that immigrants arrive according to a nonhomogeneous Poisson or a contagious Poisson process (both processes are formulated as a nonhomogeneous birth process with an appropriate choice of transition intensities). We show that the normalized numbers of objects of the various types alive at time t for supercritical, critical, and subcritical cases jointly converge in distribution under those two different arrival processes. Furthermore, some transient expectation results when there are only two types of particles are provided.
February 22: Quentin Klopfenstein
(IMB, Univ. Bourgogne Franche-Comté)
Implicit differentiation for fast hyperparameter selection in non-smooth convex learning
Abstract:
Most modern machine learning models require at least one hyperparameter to be chosen by the user upstream of the learning phase. Popular approaches, such as grid search or random search, evaluate the performance of the model for a given criterion on a grid of values, which means fitting the model for each value of the grid.
These methods have a major drawback: they scale exponentially with the number of hyperparameters. In this presentation, we will show that the hyperparameter selection problem can be cast as a bilevel optimization problem and will consider non-smooth models (such as the Lasso, the Elastic Net, the SVM).
We propose a first-order method that uses information about the gradient with respect to the hyperparameter to automatically select the best hyperparameter for a given criterion.
We will see that this method is very efficient even when the number of hyperparameters gets large.
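The idea of descending a validation criterion along the gradient with respect to the hyperparameter can be sketched on a toy case. The sketch assumes a Lasso with orthonormal design, for which the inner solution is the closed-form soft-thresholding of the data `z_train`; the data, the squared-error validation criterion and all function names are hypothetical, and this is not the algorithm presented in the talk.

```python
def soft_threshold(z, lam):
    """Closed-form Lasso solution for one coordinate under an orthonormal design."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def validation_loss(z_train, z_val, lam):
    """Outer criterion: squared error of the Lasso solution against validation targets."""
    beta = [soft_threshold(z, lam) for z in z_train]
    return 0.5 * sum((b - v) ** 2 for b, v in zip(beta, z_val))


def hypergradient(z_train, z_val, lam):
    """Derivative of the validation loss in lam, via the chain rule on the active set."""
    grad = 0.0
    for z, v in zip(z_train, z_val):
        b = soft_threshold(z, lam)
        if b != 0.0:  # inactive coordinates do not depend on lam
            dbeta_dlam = -1.0 if z > 0 else 1.0
            grad += (b - v) * dbeta_dlam
    return grad


def tune_lambda(z_train, z_val, lam0, step=0.1, n_iter=200):
    """Projected gradient descent on the hyperparameter lam >= 0."""
    lam = lam0
    for _ in range(n_iter):
        lam = max(0.0, lam - step * hypergradient(z_train, z_val, lam))
    return lam


z_train = [3.0, -2.0, 0.5, 0.1]
z_val = [2.5, -1.6, 0.0, 0.0]
lam_hat = tune_lambda(z_train, z_val, lam0=1.5)
print(lam_hat, validation_loss(z_train, z_val, lam_hat))
```

A grid search would refit the model at every grid point, whereas the first-order scheme only needs one hypergradient per iteration; with several hyperparameters the grid grows exponentially while the gradient remains a single vector.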
January 11: Davit Varron
(LmB, Univ. Bourgogne Franche-Comté)
Empirical measure and extreme values: a representation through conditional laws
Abstract:
In many statistical methods used in extreme value analysis, an estimator is built from a subsample of the initial sample. This subsample selects the observations for which a given quantity exceeds one of its order statistics. We show that these methods can be seen as images of random measures that behave like empirical measures when conditioned appropriately. Joint work with Dr Benjamin Bobbia and Prof. Clément Dombry.
December 14: Aboubacar Y. Touré
(LmB, Univ. Bourgogne Franche-Comté)
On general exponential weight functions and variation phenomenon
Abstract:
General weighted exponential distributions, including modified exponential ones, are widely used in statistical applications, particularly in reliability. In this work, we investigate full exponential weight functions and their extensions from any nonnegative continuous reference weighted distribution. Several properties and their connections with the recent variation phenomenon are then established. In particular, characterizations, weighting operations and dual distributions are put forward.
Illustrative examples and concluding remarks are extensively discussed.
December 7: Benjamin Bobbia
(LmB, Univ. Bourgogne Franche-Comté)
Extreme quantile regression: a coupling approach and Wasserstein distance
Abstract:
In this work, we develop two coupling approaches for extreme quantile regression. We consider i.i.d. copies of a covariate and a response, and we want to estimate the conditional quantile of the response given the covariate at an extreme order, that is, one governed by a very small probability level.
We introduce the proportional tail model, strongly inspired by the heteroscedastic extremes developed by Einmahl, de Haan and Zhou, where the response has a heavy tail with a given extreme value index and the conditional tails are asymptotically equivalent to one another up to a proportionality factor. We propose estimators of both the model parameters and the conditional quantile, which are studied by coupling methods.
The first method is based on coupling of empirical processes, while the second is related to optimal transport.
Although we establish the asymptotic normality of the parameter estimators with both methods, the first focuses on the quantile estimation proper, whereas the second focuses more on estimation in the presence of bias and on the elaboration of a validation procedure for our model.
Moreover, we also develop the optimal coupling approach in the general case of univariate extreme value theory.
November 30: Mehdi Dagdoug
(LmB, Univ. Bourgogne Franche-Comté)
Model-assisted estimation through random forests in finite population sampling
Abstract:
Estimation of finite population totals is of primary interest in survey sampling. Often, additional auxiliary information is available at the population level. The model-assisted approach uses this supplementary source of information to construct improved estimators of finite population totals by assuming a model between the survey variable and the potential predictors. In this work, new classes of model-assisted estimators based on random forests are suggested.
Under mild conditions, the proposed estimators are shown to be asymptotically design unbiased and consistent. Their asymptotic variance is derived, and a consistent variance estimator is suggested. The asymptotic distribution of the estimators is obtained, allowing for the use of normal-based confidence intervals. The high-dimensional behavior of the random forest estimator is also investigated and compared to commonly used estimators. Simulations illustrate that the proposed model is particularly efficient and can outperform state-of-the-art estimators, especially in complex settings such as small sample sizes, high-dimensional regressor space or complex superpopulation models. This is a joint work with Camelia Goga and David Haziza.
November 9: Clément Dombry
(LmB, Univ. Bourgogne Franche-Comté)
Behavior of linear L2-boosting algorithms in the vanishing learning rate asymptotic
Abstract:
We investigate the asymptotic behaviour of gradient boosting algorithms when the learning rate converges to zero and the number of iterations is rescaled accordingly. We mostly consider L2-boosting for regression with a linear base learner, as studied in Bühlmann and Yu (2003), and also analyze a stochastic version of the model where subsampling is used at each step (Friedman, 2002). We prove a deterministic limit in the vanishing learning rate asymptotic and characterize the limit as the unique solution of a linear differential equation in an infinite-dimensional function space. Besides, the training and test errors of the limiting procedure are thoroughly analyzed.
Joint work with Youssef Esstafa.
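The vanishing learning rate regime can be illustrated on a finite-dimensional toy version. The sketch assumes a symmetric linear smoother written in its eigenbasis (so it acts coordinate-wise through hypothetical `eigenvalues`) and compares L2-boosting run for t / nu steps with the solution of the linear ODE F' = S(y - F), F(0) = 0. It is only meant to show the rescaling, not the infinite-dimensional result of the talk.

```python
import math


def boost_fit(y, eigenvalues, nu, t):
    """Fitted values after round(t / nu) L2-boosting steps with learning rate nu.

    Each step adds nu times the base learner applied to the current residuals;
    in the eigenbasis the base learner multiplies coordinate i by eigenvalues[i].
    """
    fit = [0.0] * len(y)
    for _ in range(round(t / nu)):
        fit = [f + nu * s * (yi - f) for f, s, yi in zip(fit, eigenvalues, y)]
    return fit


def limit_fit(y, eigenvalues, t):
    """Solution at time t of F' = S (y - F), F(0) = 0, for a diagonal S."""
    return [yi * (1.0 - math.exp(-s * t)) for s, yi in zip(eigenvalues, y)]


y = [1.0, -2.0, 0.5]
eigenvalues = [1.0, 0.5, 0.1]
t = 2.0
target = limit_fit(y, eigenvalues, t)
for nu in (0.1, 0.01, 0.001):
    fit = boost_fit(y, eigenvalues, nu, t)
    print(nu, max(abs(a - b) for a, b in zip(fit, target)))
```

Coordinate-wise, the boosting residual after m steps is (1 - nu s)^m y, which tends to exp(-s t) y when m = t / nu and nu tends to zero; the printed gaps shrink accordingly.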
September 21: Jean-Jil Duchamps
(LmB, Univ. Bourgogne Franche-Comté)
The Moran forest
Abstract:
We consider the random forest obtained under the stationary distribution of the following Markov chain on graphs over a fixed vertex set: at each step, a uniformly chosen vertex is disconnected from its neighbours and reconnected to another uniformly chosen vertex. This random forest corresponds to the direct parent-offspring links between individuals in a population evolving according to a classical model (the Moran model). It admits a very simple construction, which I will describe, and which reveals its interesting links with the uniform tree on the same vertex set as well as with "uniform attachment trees". I will also give some of its characteristics: the law of the degrees, of a uniform tree, and the size of the largest degree/tree (joint work with F. Bienvenu and F. Foutel-Rodier).
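The Markov chain described in this abstract is simple enough to simulate directly. The sketch below assumes vertices labelled 0 to n - 1 and starts from the empty graph (a hypothetical choice); running it for many steps only approximates the stationary distribution, and the helper names are illustrative.

```python
import random


def moran_forest_step(adj, rng):
    """One step: a uniform vertex loses its edges and attaches to another uniform vertex."""
    n = len(adj)
    v = rng.randrange(n)
    for w in adj[v]:
        adj[w].discard(v)
    adj[v].clear()
    u = rng.randrange(n - 1)  # uniform over the n - 1 vertices different from v
    if u >= v:
        u += 1
    adj[v].add(u)
    adj[u].add(v)


def run_chain(n, steps, rng):
    """Run the chain from the empty graph on n vertices and return adjacency sets."""
    adj = [set() for _ in range(n)]
    for _ in range(steps):
        moran_forest_step(adj, rng)
    return adj


def is_forest(adj):
    """A graph is a forest iff (number of edges) = (vertices) - (components)."""
    n = len(adj)
    n_edges = sum(len(a) for a in adj) // 2
    seen, components = set(), 0
    for start in range(n):
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            for w in adj[x]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return n_edges == n - components


adj = run_chain(10, 1000, random.Random(42))
print(is_forest(adj), sorted(len(a) for a in adj))
```

Attaching a freshly isolated vertex to another vertex can never close a cycle, which is why the chain started from a forest stays within forests.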