Jun 28, 2018

by
Nathalie Peyrard; Simon de Givry; Alain Franc; Stéphane Robin; Régis Sabbadin; Thomas Schiex; Matthieu Vignes

Probabilistic graphical models offer a powerful framework to account for the dependence structure between variables, which can be represented as a graph. The dependence between variables may render inference tasks such as computing the normalizing constant, marginalization or optimization intractable. The objective of this paper is to review techniques, borrowed from optimization and computer science, that exploit the graph structure for exact inference. They are not yet standard in the statistician...

Topics: Statistics, Artificial Intelligence, Computing Research Repository, Machine Learning, Learning

Source: http://arxiv.org/abs/1506.08544

Jun 28, 2018

by
Sarah Ouadah; Stéphane Robin; Pierre Latouche

The degrees are a classical and relevant way to study the topology of a network. They can be used to assess the goodness-of-fit for a given random graph model. In this paper we introduce goodness-of-fit tests for two classes of models. First, we consider the case of independent graph models such as the heterogeneous Erdős-Rényi model in which the edges have different connection probabilities. Second, we consider a generic model for exchangeable random graphs called the W-graph. The...

Topics: Statistics, Statistics Theory, Mathematics

Source: http://arxiv.org/abs/1507.08140

Sep 23, 2013

by
Antoine Channarond; Jean-Jacques Daudin; Stéphane Robin

The Stochastic Block Model (Holland et al., 1983) is a mixture model for heterogeneous network data. Unlike the usual statistical framework, new nodes give additional information about the previous ones in this model. Thereby the distribution of the degrees concentrates in points conditionally on the node class. We show under a mild assumption that classification, estimation and model selection can actually be achieved with no more than the empirical degree data. We provide an algorithm able to...

Source: http://arxiv.org/abs/1110.6517v1

Jun 30, 2018

by
Pierre Barbillon; Mathieu Thomas; Isabelle Goldringer; Frédéric Hospital; Stéphane Robin

Dynamic extinction-colonisation models (also called contact processes) are widely studied in epidemiology and in metapopulation theory. Contacts are usually assumed to be possible only through a network of connected patches. This network accounts for a spatial landscape or a social organisation of interactions. Thanks to the social network literature, heterogeneous networks of contacts can be considered. A major issue is to assess the influence of the network in the dynamic model. Most work with...

Topics: Populations and Evolution, Other Statistics, Quantitative Biology, Statistics

Source: http://arxiv.org/abs/1404.4287

Jun 30, 2018

by
Elisabetta Bonafede; Franck Picard; Stéphane Robin; Cinzia Viroli

Next-generation sequencing technologies now constitute a method of choice to measure gene expression. Data to analyze are read counts, commonly modeled using Negative Binomial distributions. A relevant issue associated with this probabilistic framework is the reliable estimation of the overdispersion parameter, reinforced by the limited number of replicates generally observable for each gene. Many strategies have been proposed to estimate this parameter, but when differential analysis is the...

Topics: Statistics, Methodology

Source: http://arxiv.org/abs/1410.8093

Jun 30, 2018

by
Souhil Chakar; Émilie Lebarbier; Céline Lévy-Leduc; Stéphane Robin

We consider the problem of multiple change-point estimation in the mean of a Gaussian AR(1) process. Taking into account the dependence structure does not allow us to use the dynamic programming algorithm, which is the only algorithm giving the optimal solution in the independent case. We propose a robust estimator of the autocorrelation parameter, which is consistent and satisfies a central limit theorem. Then, we propose to follow the classical inference approach, by plugging this estimator...

Topics: Mathematics, Statistics Theory, Statistics, Methodology

Source: http://arxiv.org/abs/1403.1958

Jun 30, 2018

by
Julien Chiquet; Mahendra Mariadassou; Stéphane Robin

Many application domains such as ecology or genomics have to deal with multivariate non-Gaussian observations. A typical example is the joint observation of the respective abundances of a set of species in a series of sites, aiming to understand the co-variations between these species. The Gaussian setting provides a canonical way to model such dependencies, but does not apply in general. We consider here the multivariate exponential family framework for which we introduce a generic model with...

Topics: Statistics, Methodology

Source: http://arxiv.org/abs/1703.06633

Jun 30, 2018

by
Julien Chiquet; Tristan Mary-Huard; Stéphane Robin

Conditional Gaussian graphical models (cGGM) are a recent reparametrization of the multivariate linear regression model which explicitly exhibits (i) the partial covariances between the predictors and the responses, and (ii) the partial covariances between the responses themselves. Such models are particularly suitable for interpretability since partial covariances describe strong relationships between variables. In this framework, we propose a regularization scheme to enhance the learning...

Topics: Statistics, Methodology

Source: http://arxiv.org/abs/1403.6168

Sep 23, 2013

by
Cécile Durot; François Koladjo; Sylvie Huet; Stéphane Robin

Non-parametric estimation of a convex discrete distribution may be of interest in several applications, such as the estimation of species abundance distribution in ecology. In this paper we study the least squares estimator of a discrete distribution under the constraint of convexity. We show that this estimator exists and is unique, and that it always outperforms the classical empirical estimator in terms of the $\ell_{2}$-distance. We provide an algorithm for its computation, based on the...

Source: http://arxiv.org/abs/1202.6263v1

Jun 30, 2018

by
Catherine Matias; Stéphane Robin

We present a selective review on probabilistic modeling of heterogeneity in random graphs. We focus on latent space models and more particularly on stochastic block models and their extensions that have undergone major developments in the last five years.

Topics: Mathematics, Statistics Theory, Statistics, Methodology

Source: http://arxiv.org/abs/1402.4296

Jun 29, 2018

by
Loïc Schwaller; Stéphane Robin

We consider the problem of change-point detection in multivariate time-series. The multivariate distribution of the observations is supposed to follow a graphical model, whose graph and parameters are affected by abrupt changes throughout time. We demonstrate that it is possible to perform exact Bayesian inference whenever one considers a simple class of undirected graphs called spanning trees as possible structures. We are then able to integrate on the graph and segmentation spaces at the same...

Topics: Machine Learning, Statistics

Source: http://arxiv.org/abs/1603.07871

Sep 21, 2013

by
Alice Cleynen; Michel Koskas; Emilie Lebarbier; Guillem Rigaill; Stephane Robin

Genome annotation is an important issue in biology which has long been addressed with gene prediction methods and manual experiments requiring biological expertise. The expanding Next Generation Sequencing technologies and their enhanced precision allow a new approach to the domain: the segmentation of RNA-Seq data to determine gene boundaries. Because of its almost linear complexity, we propose to use the Pruned Dynamic Programming Algorithm, whose performance has been acknowledged for CGH...

Source: http://arxiv.org/abs/1204.5564v3

Jun 28, 2018

by
Pierre Latouche; Stéphane Robin; Sarah Ouadah

Logistic regression is a natural and simple tool to understand how covariates contribute to explaining the topology of a binary network. Once the model is fitted, the practitioner is interested in the goodness-of-fit of the regression in order to check whether the covariates are sufficient to explain the whole topology of the network and, if they are not, to analyze the residual structure. To address this problem, we introduce a generic model that combines logistic regression with a network-oriented...

Topics: Methodology, Statistics, Computation

Source: http://arxiv.org/abs/1508.00286

Sep 21, 2013

by
Mahendra Mariadassou; Stéphane Robin; Corinne Vacher

As more and more network-structured data sets are available, the statistical analysis of valued graphs has become commonplace. Looking for a latent structure is one of the many strategies used to better understand the behavior of a network. Several methods already exist for the binary case. We present a model-based strategy to uncover groups of nodes in valued graphs. This framework can be used for a wide span of parametric random graph models and allows covariates to be included. Variational...

Source: http://arxiv.org/abs/1011.1813v1

Sep 22, 2013

by
Guillem Rigaill; Emilie Lebarbier; Stéphane Robin

In segmentation problems, inference on change-point position and model selection are two difficult issues due to the discrete nature of change-points. In a Bayesian context, we derive exact, non-asymptotic, explicit and tractable formulae for the posterior distribution of variables such as the number of change-points or their positions. We also derive a new selection criterion that accounts for the reliability of the results. All these results are based on an efficient strategy to explore the...

Source: http://arxiv.org/abs/1004.4347v1

Sep 21, 2013

by
Caroline Bérard; Marie-Laure Martin-Magniette; Véronique Brunaud; Sébastien Aubourg; Stéphane Robin

Tiling arrays make possible a large-scale exploration of the genome thanks to probes which cover the whole genome with very high density, up to 2,000,000 probes. Biological questions usually addressed are either the expression difference between two conditions or the detection of transcribed regions. In this work we propose to consider both questions simultaneously as an unsupervised classification problem by modeling the joint distribution of the two conditions. In contrast to previous methods,...

Source: http://arxiv.org/abs/1104.5429v1

Jul 20, 2013

by
Alain Celisse; Stéphane Robin

In the multiple testing context, a challenging problem is the estimation of the proportion $\pi_0$ of true-null hypotheses. A large number of estimators of this quantity rely on identifiability assumptions that either appear to be violated on real data, or may be at least relaxed. Under independence, we propose an estimator $\hat{\pi}_0$ based on density estimation using both histograms and cross-validation. Due to the strong connection between the false discovery rate (FDR) and $\pi_0$, many...

Source: http://arxiv.org/abs/0804.1189v1

Sep 23, 2013

by
Stevenn Volant; Marie-Laure Martin Magniette; Stéphane Robin

We consider a binary unsupervised classification problem where each observation is associated with an unobserved label that we want to retrieve. More precisely, we assume that there are two groups of observations: normal and abnormal. The 'normal' observations come from a known distribution whereas the distribution of the 'abnormal' observations is unknown. Several models have been developed to fit this unknown distribution. In this paper, we propose an alternative based on a mixture of...

Source: http://arxiv.org/abs/1105.0760v1

Jun 27, 2018

by
Eleni Ioanna Delatola; Emilie Lebarbier; Tristan Mary-Huard; François Radvanyi; Stéphane Robin; Jennifer Wong

Motivation: Detecting local correlations in expression between neighboring genes along the genome has proved to be an effective strategy to identify possible causes of transcriptional deregulation in cancer. It has been successfully used to illustrate the role of mechanisms such as copy number variation (CNV) or epigenetic alterations as factors that may significantly alter expression in large chromosomal regions (gene silencing or gene activation). Results: The identification of correlated...

Topics: Methodology, Statistics

Source: http://arxiv.org/abs/1504.05738

Jun 27, 2018

by
Xavier Collilieux; Emilie Lebarbier; Stéphane Robin

We consider the segmentation of a set of time series that are correlated, e.g. because of some spatial structure. We propose to model the between-series dependency with a factor model. This modeling allows us to use the dynamic programming algorithm for the inference of the breakpoints, which remains the most efficient strategy. We also propose a model selection procedure to determine both the number of breakpoints and the number of factors. The performance of our proposed procedure is assessed through...

Topics: Methodology, Statistics

Source: http://arxiv.org/abs/1505.05660