Measuring the effect of perturbation

The Jaccard Coefficient can be used to measure the amount of perturbation introduced to the sanitized graph $S_{\textsf {san}}$ during the release process:

\begin{displaymath}\frac{\sum_{u\in V_{\textsf {san}}}\nu(u)JC(u)}{\sum_{u\in V_{\textsf {san}}}\nu(u)}\end{displaymath}

where $\nu(u)$ is the centrality of the node $u$ and the Jaccard Coefficient $JC(u)$ is defined in this context as follows:

\frac{\vert\{v \in \tilde{V} \colon (E(u,v) \wedge \tilde{E}...
...,v) \vee \tilde{E}(u,v) \vee E(v,u) \vee

$\mbox{ where } \tilde{V} = V_{\textsf {san}} \mbox{ and } \tilde{E} = E_{\textsf {san}}$. In the above expression, the numerator counts the number of edges that are left unchanged in $E_{\textsf {san}}$, taking directionality into account. The denominator counts all edges that exist in either direction in either $E$, or $E_{\textsf {san}}$.

A more obvious measure that simply counts the number of edges added or removed, as a fraction of the total number of edges, would ignore the effect of perturbation on individual nodes. By contrast, our measure takes this into account, weighing nodes in proportion to their centrality in the network (this is the purpose of the $\nu$ factor).

Arvind Narayanan 2009-03-19