An S4 class to represent a Stochastic Block Model with a sampling scheme for missing data. It extends icl_model-class.
Such a model can be used to cluster the vertices of a graph. It models a square adjacency matrix \(X\) with the following generative model:
$$ \pi \sim Dirichlet(\alpha)$$
$$ Z_i \sim \mathcal{M}(1,\pi)$$
$$ \theta_{kl} \sim Beta(a_0,b_0)$$
$$ X_{ij}|Z_{ik}Z_{jl}=1 \sim \mathcal{B}(\theta_{kl})$$
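The generative model above can be simulated directly. The following is a minimal base R sketch, not the package's own sampler: the sizes `n` and `K`, the seed, and all variable names are illustrative assumptions, and the prior parameters are set to the documented defaults.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 50; K <- 3                  # illustrative sizes (assumptions)
alpha <- 1; a0 <- 1; b0 <- 1     # prior parameters, set to the documented defaults

# pi ~ Dirichlet(alpha), drawn by normalising independent Gamma variables
g <- rgamma(K, shape = alpha)
prop <- g / sum(g)

# Z_i ~ M(1, pi), stored as cluster labels in 1..K
z <- sample.int(K, n, replace = TRUE, prob = prop)

# theta_kl ~ Beta(a0, b0), one connection probability per pair of clusters
theta <- matrix(rbeta(K * K, a0, b0), K, K)

# X_ij | Z_ik Z_jl = 1 ~ Bernoulli(theta_kl), directed case
P <- theta[z, z]                 # n x n matrix with entries theta[z_i, z_j]
X <- matrix(rbinom(n * n, 1, P), n, n)
```

Indexing `theta[z, z]` expands the K x K block parameters into an n x n matrix of edge probabilities, so a single vectorised `rbinom` call draws every dyad.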
Missing values are assumed to be generated afterwards. The observation process corresponds to a binary matrix \(R\) of the same size as \(X\),
with \(R_{ij}=1\) for observed entries and \(R_{ij}=0\) for missing ones. \(R\) may be assumed to be missing at random (MAR):
$$ \epsilon \sim Beta(a_{0obs},b_{0obs})$$
$$ R_{ij} \sim \mathcal{B}(\epsilon)$$
This corresponds to the "dyad" sampling scheme. The sampling scheme can also be not missing at random (NMAR):
$$ \epsilon_{kl} \sim Beta(a_{0obs},b_{0obs})$$
$$ R_{ij}|Z_{ik}Z_{jl}=1 \sim \mathcal{B}(\epsilon_{kl})$$
This corresponds to the "block-dyad" sampling scheme.
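The two sampling schemes differ only in whether the observation probability depends on the cluster memberships. The following base R sketch illustrates this under stated assumptions: the labels `z` and the adjacency matrix `X` are hypothetical placeholders drawn independently here, and the variable names are not part of the package.

```r
set.seed(2)                               # arbitrary seed, for reproducibility only
n <- 50; K <- 2                           # illustrative sizes (assumptions)
a0obs <- 1; b0obs <- 1                    # sampling prior parameters

z <- sample.int(K, n, replace = TRUE)     # hypothetical cluster labels
X <- matrix(rbinom(n * n, 1, 0.1), n, n)  # placeholder adjacency matrix

# "dyad" scheme (MAR): a single observation probability for every dyad
eps <- rbeta(1, a0obs, b0obs)
R_dyad <- matrix(rbinom(n * n, 1, eps), n, n)

# "block-dyad" scheme (NMAR): one observation probability per pair of clusters
eps_kl <- matrix(rbeta(K * K, a0obs, b0obs), K, K)
R_block <- matrix(rbinom(n * n, 1, eps_kl[z, z]), n, n)

# Entries with R_ij = 0 are missing and encoded as NA
X_obs <- X
X_obs[R_block == 0] <- NA
```

Encoding missing dyads as `NA` matches the convention used in the example at the end of this page, where entries of the adjacency matrix are set to `NA` before fitting.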
This class mainly stores the prior parameter values \(\alpha,a_0,b_0,a_{0obs},b_{0obs}\) of this generative model in the following slots:
name
name of the model
alpha
Dirichlet prior parameter over the cluster proportions (defaults to 1)
a0
Beta prior parameter over links (defaults to 1)
b0
Beta prior parameter over no-links (defaults to 1)
type
defines the type of network (either "directed" or "undirected", defaults to "directed")
sampling
defines the sampling process (either "dyad" or "block-dyad")
sampling_priors
defines the prior parameters of the sampling process (a list with a0obs and b0obs fields)
Nowicki, Krzysztof and Tom A B Snijders (2001). “Estimation and prediction for stochastic block structures”. In: Journal of the American Statistical Association 96.455, pp. 1077–1087.
new("misssbm")
#> An object of class "misssbm"
#> Slot "sampling":
#> [1] "block-dyad"
#>
#> Slot "sampling_priors":
#> $a0obs
#> [1] 1
#>
#> $b0obs
#> [1] 1
#>
#>
#> Slot "a0":
#> [1] 1
#>
#> Slot "b0":
#> [1] 1
#>
#> Slot "type":
#> [1] "directed"
#>
#> Slot "name":
#> [1] "misssbm"
#>
#> Slot "alpha":
#> [1] 1
#>
new("misssbm",a0=0.5,b0=0.5,alpha=0.5,sampling="dyad",sampling_priors=list(a0obs=2,b0obs=1))
#> An object of class "misssbm"
#> Slot "sampling":
#> [1] "dyad"
#>
#> Slot "sampling_priors":
#> $a0obs
#> [1] 2
#>
#> $b0obs
#> [1] 1
#>
#>
#> Slot "a0":
#> [1] 0.5
#>
#> Slot "b0":
#> [1] 0.5
#>
#> Slot "type":
#> [1] "directed"
#>
#> Slot "name":
#> [1] "misssbm"
#>
#> Slot "alpha":
#> [1] 0.5
#>
sbm = rsbm(100,c(0.5,0.5),diag(2)*0.1+0.01)
sbm$x[cbind(base::sample(1:100,50),base::sample(1:100,50))]=NA
sol = greed(sbm$x,model=new("misssbm",sampling="dyad"))
#> ------- directed MISSSBM with dyad sampling model fitting ------
#> ################# Generation 1: best solution with an ICL of -2419 and 2 clusters #################
#> ################# Generation 2: best solution with an ICL of -2419 and 2 clusters #################
#> ------- Final clustering -------
#> ICL clustering with a MISSSBM model, 2 clusters and an icl of -2419.