The correspondence between accounts or profiles (i.e., network nodes) and real-world identities varies greatly from social network to social network. A wired telephone may be shared by a family or an office, while mobile phones are much more likely to belong to a single person. Some online social networks such as Facebook attempt to ensure that accounts accurately reflect real-world information, while others such as MySpace are notoriously lax. Fake MySpace profiles have been created for pets and celebrities, and a user may create multiple profiles with contradictory or fake information.
In this paper, we eschew an explicit notion of identity and focus instead on entities, which are simply sources of social-network profile information that are consistent across different networks and service providers. In most cases, an entity is associated with a real-world person, but it does not have to be (e.g., consider a political campaign that has a YouTube account and a Twitter account). The concept of entities also allows us to capture information that is characteristic of a user across multiple networks (for example, an unusual username) but is not related to anything in the real world.
In our model, nodes are purely collections of their attributes, and to identify a node simply means to learn the entity to which the node belongs, whether this entity is a single person, a group, or an organization. We assume that correctly associating a node with the corresponding entity constitutes a breach of anonymity. The question of whether the entity is a single individual or not is extraneous to our model.
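As an illustration only, the model above can be sketched in a few lines of Python. The names `Node`, `Entity`, and `is_breach`, and the separate `network` field, are assumptions made for this sketch and do not appear in the paper; the ground-truth mapping is likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Entity:
    """A source of profile information: a person, a group, or an organization."""
    label: str  # hypothetical label, used only to tell entities apart


@dataclass(frozen=True)
class Node:
    """A node is purely a collection of attributes (plus, by assumption, its network)."""
    network: str
    attributes: FrozenSet[Tuple[str, str]]


def is_breach(node: Node, guess: Entity, truth: Dict[Node, Entity]) -> bool:
    """Return True if `guess` is the entity that `node` actually belongs to.

    In the model, correctly associating a node with its entity counts as a
    breach of anonymity, whether or not the entity is a single individual.
    """
    return truth.get(node) == guess


# A political campaign with accounts on two networks is a single entity.
campaign = Entity("campaign")
youtube_node = Node("YouTube", frozenset({("username", "example_campaign")}))
twitter_node = Node("Twitter", frozenset({("username", "example_campaign")}))
ground_truth = {youtube_node: campaign, twitter_node: campaign}
```

Here both nodes map to the same entity, so linking either one to `campaign` would count as a breach under the assumption stated above.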
Arvind Narayanan 2009-03-19