Representation Learning in Continuous-Time Dynamic Signed Networks

Kelvin Jose
9 min read · Nov 25, 2024

This is a review of the paper Representation Learning in Continuous-Time Dynamic Signed Networks, which proposes a new architecture named SEMBA to capture the dynamics of evolving signed networks.

Introduction

We have been actively discussing continuous-time dynamic networks, also known as temporal graphs, for some time now, and this is the 5th installment in the series.

  1. Transforming Data: The rise of Temporal Graphs
  2. Continuous-Time Dynamic Network Embeddings
  3. Embedding Temporal Network via Neighborhood Formation
  4. Direct Embedding of Temporal Network Edges via Time-Decayed Line Graphs

In all of the above posts, we saw how useful and effective temporal graphs are in modeling information that evolves over time, be it social interactions, online purchases, web activities, etc. In contrast with a static graph, a temporal graph keeps track of the timestamps of the interactions that occur in its context, to account for the temporal evolution of its nodes and edges. There are many ways in which a graph can evolve: new nodes may join or leave the graph, edges may appear or disappear between nodes, and the features of nodes and edges can be updated over time. Along with some of these attributes of a temporal graph, this study focuses on a property called a sign, which denotes friendly or conflicting relationships among different nodes or entities in a graph. In a signed network (more on this below), each edge is annotated with a sign, either positive (+) or negative (−), based on the nature of the relationship between the nodes at its two ends.

The authors propose a new Temporal Graph Network (TGN) based approach to model dynamic signed networks, named SEMBA. The core idea is to use the sign information together with balance theory to form messages, as in TGN, which are then aggregated and stored in an intermediate memory corresponding to each node. This memory is later used to generate temporal and sign-aware representations of the nodes.

Signed Networks

Signed networks are primarily designed to capture the evolution of polarization on social platforms, because the relationships between users often reflect a mixture of positive and negative interactions (Jure Leskovec et al.). Each node in the graph represents a user or an entity, and the edges connecting them are assigned either a “+” or a “-” sign to indicate the nature of the relationship. Positive edges typically signify cooperation, agreement or trust, while negative edges denote conflict, disagreement or distrust. This dual nature allows signed networks to model complex social dynamics more realistically than unsigned networks, which treat all interactions as equivalent.

Figure 1, Example of a signed network

In social networks, signed edges can reveal patterns of cooperation and conflict within communities. Positive edges often cluster within groups, reflecting tight social bonds, while negative edges occur more frequently between groups than within them, indicating disagreements. In the above figure (Figure 1), the two edges V₂–V₃ and V₂–V₄ are negative, whereas all other interactions are positive. It is important to note that the exact meanings of positive and negative edges differ across settings, but the core concepts stay the same.

Balance Theory

Structural balance theory is a theoretical concept that originated in social psychology in the mid-20th century. It considers the possible ways in which a triangle over three individuals can be signed, and posits that triangles with three positive signs (three mutual friends) and those with exactly one positive sign (two friends with a common enemy) are more plausible, and should be seen more frequently in real networks, than triangles with two positive signs (two enemies with a common friend) or none (three mutual enemies).

Figure 2, Balanced and unbalanced triads.

Balanced triangles with three positive edges (first triad in Figure 2) capture the principle that “the friend of my friend is my friend”, whereas those with one positive and two negative edges (second triad in Figure 2) capture the notions that “the enemy of my friend is my enemy” and “the enemy of my enemy is my friend”.
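
As a minimal illustration, whether a triad is balanced can be checked by taking the product of its three edge signs (this helper is my own, not from the paper):

```python
def is_balanced(triad):
    """A triad of three signed edges (each +1 or -1) is balanced exactly
    when the product of its signs is positive."""
    s1, s2, s3 = triad
    return s1 * s2 * s3 > 0

# The four possible sign patterns on a triangle (cf. Figure 2):
assert is_balanced((+1, +1, +1))      # three mutual friends
assert is_balanced((+1, -1, -1))      # two friends with a common enemy
assert not is_balanced((+1, +1, -1))  # two enemies with a common friend
assert not is_balanced((-1, -1, -1))  # three mutual enemies
```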

SEMBA - The Algorithm & The Approach

The authors call the proposed algorithm, which learns representations of the nodes in a signed network, Signed Links Evolution using Memory modules and Balanced Aggregations, or SEMBA for short. The algorithm learns two kinds of representations per node: positive and negative memories that encode temporal signed interactions, and node embeddings that incorporate interactions with long-term neighbors. The authors test the effectiveness of the learned representations through various tasks:

  1. Dynamic Link Existence Prediction: The goal is to predict whether a link exists between nodes u and v in the future.
  2. Dynamic Link Sign Prediction: Tries to predict the sign of the link.
  3. Dynamic Signed Link Existence Prediction: The goal is to predict whether a positive link, a negative link, or no link exists between nodes u and v in the future.
  4. Dynamic Signed Link Weight Prediction: The goal is to predict the link weight between nodes u and v.

As per the authors, SEMBA is motivated by three key observations,

  1. Links Evolution: In an evolving network, new links are added based on the past interactions of a node at any given time.
  2. Structural Balance: Following balance theory, a future link between nodes u and v, and their representations, depend solely on information coming from either a friend of a friend or an enemy of a friend.
  3. Long-tailed distribution: The nodes of a real network follow a power-law degree distribution, i.e. only a few nodes have high degree.

Following these three observations, the authors introduce three major components to the architecture and they are,

Signed Memories : Each node keeps two separate memories that store its past positive and negative interactions, which can later be used to predict future links.

Balanced Aggregation : When a new signed link (u, v) is added, one of two things happens:

  1. If the sign of the link is positive, memories of the same polarity influence each other. In other words, the positive memory of node u is updated using the positive memory of node v, and likewise for the negative memories.
  2. If the sign is negative, memories of opposite polarity influence each other. That is, the positive memory of node u is updated using the negative memory of v, and u’s negative memory is updated using v’s positive memory.
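
These two routing rules can be sketched as a small lookup (an illustrative helper of my own, not code from the paper):

```python
def balanced_sources(sign):
    """For a new link (u, v) with the given sign, return which memory of
    node v feeds each memory of node u, per balance theory."""
    if sign > 0:
        # same polarity: the friend of my friend is my friend,
        # the enemy of my friend is my enemy
        return {"pos": "pos", "neg": "neg"}
    # opposite polarity: the enemy of my enemy is my friend,
    # the friend of my enemy is my enemy
    return {"pos": "neg", "neg": "pos"}

assert balanced_sources(+1) == {"pos": "pos", "neg": "neg"}
assert balanced_sources(-1) == {"pos": "neg", "neg": "pos"}
```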

Long-term Propagation : To generate the embedding of a node, the authors aggregate information from its past neighbors, which mitigates the staleness problem by propagating information from neighbors across all time steps.

Model Pipeline

The whole SEMBA pipeline is arranged into 4 steps and they are,

  1. Event encoding as signed messages based on balance theory
  2. Signed messages aggregation
  3. Node memory update using the aggregated messages
  4. Embedding generation

SEMBA model pipeline

Message Generator : Given an interaction (u, v), signed messages for nodes u and v are generated based on balance theory, such that u’s information is passed on to v and vice versa. Specifically, when the sign is positive, information in the positive memory of node u is used by the positive memory of node v. Negative information is shared in a similar fashion.

Message calculation

The above two equations demonstrate how messages are generated by passing information from node v to u, where

  • Sᵤ⁺ and Sᵤ⁻ are the positive and negative memories of u, and Sᵥ⁺ and Sᵥ⁻ are those of v.
  • ∥ is the concatenation operator, which merges the information of the two nodes.
  • Δtᵤ is the time elapsed since node u was last updated.
  • |eᵤᵥ(t)| denotes the edge weight.
  • msg is a multi-layer perceptron (MLP).
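
Under these definitions, message generation can be sketched as follows; the learnable msg MLP is replaced here by a fixed random linear map so the sketch runs end to end, and all names and dimensions are illustrative rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4  # illustrative memory dimension

# Stand-in for the paper's learnable msg MLP: a random linear map.
W = rng.standard_normal((D, 2 * D + 2))

def msg(x):
    return W @ x

def make_message(s_recv, s_src, delta_t, edge_w):
    """Concatenate the receiving memory, the balance-selected source
    memory of the other endpoint, the time since the receiver's last
    update, and the edge weight, then apply the msg MLP."""
    x = np.concatenate([s_recv, s_src, [delta_t], [edge_w]])
    return msg(x)

m_pos = make_message(rng.standard_normal(D), rng.standard_normal(D),
                     delta_t=3.0, edge_w=1.0)
assert m_pos.shape == (D,)
```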

Message Aggregation : Multiple messages can be formed per node when the current batch contains many interactions involving the same node, so the positive and negative messages are aggregated into a single positive and a single negative message respectively. The aggregation function can be a mean aggregator, an LSTM, or a function that simply returns the most recent message in the set. The authors chose the last option and report that they did not observe any significant differences among the alternatives.
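
The most-recent aggregator the authors settled on is simple to sketch (an illustrative helper, not the paper's code):

```python
def aggregate_most_recent(messages):
    """messages: list of (timestamp, payload) pairs for one node in the
    current batch. Keep only the latest message."""
    return max(messages, key=lambda tm: tm[0])[1]

msgs = [(1.0, "m1"), (5.0, "m3"), (3.0, "m2")]
assert aggregate_most_recent(msgs) == "m3"
```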

Memory Update : Once the aggregated messages are computed, the next step is to update the positive and negative memories of a node. To do so, the authors pass the newly aggregated message and the node’s old memory through an LSTM layer, which produces an updated memory that combines the older and newer information.
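
A rough sketch of this update step, substituting a simpler GRU-style gated blend for the paper's LSTM (all weights and names here are illustrative stand-ins):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

D = 4                                  # illustrative memory dimension
rng = np.random.default_rng(1)
Wz = rng.standard_normal((D, 2 * D))   # gate weights (random stand-ins)
Wh = rng.standard_normal((D, 2 * D))   # candidate weights

def update_memory(old_mem, agg_msg):
    """Blend the old memory with a candidate computed from the
    aggregated message, so both older and newer information survive."""
    x = np.concatenate([old_mem, agg_msg])
    z = sigmoid(Wz @ x)                # how much of the memory to overwrite
    cand = np.tanh(Wh @ x)             # candidate new memory content
    return (1 - z) * old_mem + z * cand

new_mem = update_memory(rng.standard_normal(D), rng.standard_normal(D))
assert new_mem.shape == (D,)
```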

Embedding Generation : The last step in the pipeline is to generate the actual embedding of a node u. For this, the memory of node u is used jointly with the memories of all its neighbors across timestamps.

Embedding generation

  • Here Sᵤ is formed by concatenating the updated positive and negative memories of node u.
  • The updated memories of node u’s neighbors across all timestamps are weighted and averaged.
  • Together, the updated memory of a node and those of its neighbors produce the temporal, sign-aware embedding used for downstream tasks.
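
A simplified sketch of this step, using a plain weighted average in place of the paper's learned aggregation (names and dimensions are my own):

```python
import numpy as np

D = 4                                  # illustrative memory dimension
rng = np.random.default_rng(2)

def node_embedding(s_u_pos, s_u_neg, neighbor_memories, weights):
    """Concatenate u's positive and negative memories, then add a
    weighted average of the neighbors' concatenated memories."""
    s_u = np.concatenate([s_u_pos, s_u_neg])
    nbr = np.average(neighbor_memories, axis=0, weights=weights)
    return s_u + nbr

emb = node_embedding(
    rng.standard_normal(D),
    rng.standard_normal(D),
    neighbor_memories=rng.standard_normal((3, 2 * D)),  # 3 past neighbors
    weights=[0.5, 0.3, 0.2],           # e.g. recency-based weights
)
assert emb.shape == (2 * D,)
```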

Evaluation

We will first take a step back to discuss the metrics used in the evaluation of each task. For the binary classification tasks, Dynamic Link Existence Prediction and Dynamic Link Sign Prediction, the metrics are the F1 score and AUROC (Area Under the Receiver Operating Characteristic curve). F1 is used mainly because of the class imbalance present in the training and evaluation data. For Dynamic Signed Link Weight Prediction, which is a regression task, RMSE, R² and KL divergence quantify the performance.
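
For reference, the binary F1 score these tasks report can be computed from scratch as the harmonic mean of precision and recall:

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Imbalanced toy labels: F1 penalizes the missed positive and the false
# alarm, even though plain accuracy would look high (6/8 correct).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
assert abs(f1_score(y_true, y_pred) - 2 / 3) < 1e-9
```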

In Dynamic Link Existence Prediction, SEMBA performs on par with the state-of-the-art TGN model, with a maximum difference of 0.01 in the F1 and AUROC metrics. In Dynamic Link Sign Prediction, which predicts whether a future link is positive or negative, the model outperforms all the baselines across all the datasets by substantial margins. The authors also combine these two tasks into a single multi-class classification task, Dynamic Signed Link Existence Prediction, where the objective is to predict whether a link will exist between two nodes in the future and, if so, its sign; here SEMBA outperforms or matches the baselines across all datasets. On the last task, Dynamic Signed Link Weight Prediction, i.e. predicting the signed weights of links at time t given the graph up to the previous batch, the authors again report that SEMBA either outperforms or matches the baselines.

Conclusion

The core idea of this research was to learn the dynamics of a signed network’s evolution using a GNN-based model that shares several components with the well-known TGN architecture. Balance theory governs how information is shared among nodes when a new link is created, by framing each update as completing a balanced triad. The paper does not say much about graph events other than edge addition, but I hope the authors explore them in future versions.

It was fun reading this and I think having some understanding about the TGN architecture helped a lot.

Thanks for spending some time here.
