A modular implementation of Variational Graph Auto-Encoders (VGAE) for link prediction on graph-structured data.
Course Project: COMP 790 (Generative Methods in Machine Learning, Graduate) — UNC Chapel Hill, Spring 2019
Authors: Alex Kan and Armaan Sethi
Completed as an undergraduate taking a graduate course
Link prediction answers the question: "Given a graph with some edges, which edges are missing?"
This is useful for:
- Social networks: Recommending friends ("People you may know")
- Knowledge graphs: Inferring missing facts (e.g., "X was born in Y")
- Citation networks: Predicting which papers should cite each other
- Drug discovery: Predicting protein-protein interactions
The challenge: graphs are irregular (unlike images), so we can't use standard CNNs.
We combine two powerful ideas:
GCNs generalize convolutions to graphs. Instead of sliding a filter over pixels, they aggregate information from neighboring nodes:
h_v = σ( Σ_{u ∈ N(v)} (1/√(d_u · d_v)) · W · h_u ), where N(v) is the set of neighbors of v
This lets each node "see" its local neighborhood and learn useful representations.
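The propagation rule above can be sketched in a few lines of PyTorch. This is a minimal stand-alone example (random features and weights, a toy 3-node graph with self-loops added), not the project's actual layer implementation:

```python
import torch

# Toy undirected graph: edges (0-1) and (1-2), with self-loops on the diagonal
A = torch.tensor([[1., 1., 0.],
                  [1., 1., 1.],
                  [0., 1., 1.]])
X = torch.rand(3, 4)   # node features, F = 4
W = torch.rand(4, 2)   # learnable weight matrix

# Symmetric normalization: D^(-1/2) A D^(-1/2) gives the 1/sqrt(d_u * d_v) factors
deg = A.sum(dim=1)
D_inv_sqrt = torch.diag(deg.pow(-0.5))
A_norm = D_inv_sqrt @ A @ D_inv_sqrt

# One propagation step: each node aggregates its normalized neighborhood
H = torch.relu(A_norm @ X @ W)   # shape (3, 2)
```

Stacking two such steps (with different weight matrices) gives each node a view of its 2-hop neighborhood.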
VAEs learn a compressed latent representation by:
- Encoding input → latent distribution (mean μ, variance σ²)
- Sampling from that distribution
- Decoding back to reconstruct the input
- Regularizing the latent space to be smooth (via KL-divergence)
The "variational" part is key: instead of mapping each node to a single point, we map it to a distribution. This gives us:
- Better generalization
- Ability to sample new graphs
- More robust learned representations
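The "node maps to a distribution" idea is easy to see concretely. In this sketch (zero mean and unit variance as stand-ins for real encoder outputs), two forward passes through the same node give two different embeddings:

```python
import torch

torch.manual_seed(0)
mu = torch.zeros(1, 4)       # one node's latent mean
logvar = torch.zeros(1, 4)   # log-variance (so sigma = 1 here)

def sample(mu, logvar):
    # Reparameterization trick: sampling stays differentiable w.r.t. mu and sigma
    eps = torch.randn_like(mu)
    return mu + eps * torch.exp(0.5 * logvar)

z1, z2 = sample(mu, logvar), sample(mu, logvar)
print(torch.equal(z1, z2))   # False: a distribution, not a single point
```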
Graph (nodes + edges)
→ GCN Encoder → Node embeddings (μ, σ for each node)
→ Sample z ~ N(μ, σ²)
→ Inner product decoder → Predicted edges
Training objective (ELBO):
- Reconstruction loss: Did we predict the right edges?
- KL loss: Is our latent space well-regularized?
| Dataset | Nodes | Edges | Baseline (16-dim) | Optimized (64-dim) |
|---|---|---|---|---|
| Cora | 2,708 | 5,429 | 0.82 / 0.83 | 0.86 / 0.87 |
| Citeseer | 3,327 | 4,732 | 0.82 / 0.83 | 0.88 / 0.89 |
| Pubmed | 19,717 | 44,338 | 0.80 / 0.81 | 0.85 / 0.86 |
- AUC (Area Under ROC Curve): How well can we distinguish real edges from non-edges?
- AP (Average Precision): Precision at various recall thresholds
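Both metrics can be computed with scikit-learn given edge scores for held-out positive edges and sampled non-edges. A minimal sketch with made-up scores:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Labels: 1 = real held-out edge, 0 = sampled non-edge
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
# Decoder scores sigma(z_i . z_j) for each candidate pair
y_score = np.array([0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1])

auc = roc_auc_score(y_true, y_score)            # real vs. non-edge separability
ap = average_precision_score(y_true, y_score)   # precision across recall levels
```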
Latent dimension matters more than depth
- Going from 16 → 64 dimensions: +6% AUC
- Adding more GCN layers: minimal improvement
- Why? Larger latent space can capture more nuanced node relationships. But stacking GCN layers causes over-smoothing (all nodes start looking similar).
The model is surprisingly robust
- Works well even with default hyperparameters
- The VAE regularization prevents overfitting
Inner product decoder is simple but effective
- We tried fancier decoders (with learned parameters)
- Simple dot product (z_i · z_j) worked just as well
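The parameter-free decoder fits in two lines. A minimal sketch with random stand-in embeddings (5 nodes, 16-dim latent space):

```python
import torch

torch.manual_seed(0)
z = torch.randn(5, 16)   # latent embeddings for 5 nodes

# Inner-product decoder: edge probability for every pair at once
A_prob = torch.sigmoid(z @ z.t())   # (5, 5) matrix, P(edge between i and j)

# Or for a single candidate edge (i, j)
p_01 = torch.sigmoid(z[0] @ z[1])
```

Because the decoder has no parameters, all of the model's capacity lives in the encoder, which is part of why latent dimension matters so much.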
Input: Node features X ∈ ℝ^(N×F), Adjacency A
↓
┌─────────────────────────────────────────────────┐
│ Encoder (2-3 GCN layers) │
│ │
│ h = ReLU(GCN₁(X, A)) # First layer │
│ μ = GCN_μ(h, A) # Mean │
│ log(σ²) = GCN_σ(h, A) # Log-variance │
└─────────────────────────────────────────────────┘
↓
Reparameterization Trick
z = μ + ε·σ, where ε ~ N(0,1)
(Allows backprop through sampling)
↓
┌─────────────────────────────────────────────────┐
│ Decoder (Inner Product) │
│ │
│ P(edge between i,j) = σ(z_i · z_j) │
│ (Sigmoid of dot product) │
└─────────────────────────────────────────────────┘
↓
Loss = BCE(predicted, actual) + β·KL(q(z|X,A) || p(z))
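The loss line at the bottom of the diagram can be computed directly. This sketch uses random stand-ins for the encoder outputs and a toy dense adjacency target; the KL term is the standard closed form for a diagonal Gaussian vs. N(0, I), averaged per node (an assumption about the weighting, not necessarily the project's exact convention):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, D = 6, 8
mu, logvar = torch.randn(N, D), torch.randn(N, D)   # stand-in encoder outputs

# Reparameterized sample, then dense edge probabilities
z = mu + torch.randn(N, D) * torch.exp(0.5 * logvar)
A_prob = torch.sigmoid(z @ z.t())

# Toy target adjacency (symmetric, self-loops included)
A_true = torch.eye(N)
A_true[0, 1] = A_true[1, 0] = 1.0

beta = 1.0
recon = F.binary_cross_entropy(A_prob, A_true)                  # BCE term
kl = -0.5 / N * torch.sum(1 + logvar - mu**2 - logvar.exp())    # KL term
loss = recon + beta * kl
```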
├── newvgae/ # Main implementation (PyTorch Geometric)
│ ├── vgae.py # GAE and VGAE model classes
│ ├── model.py # Encoder architectures
│ └── utils.py # Data loading and preprocessing
│
├── vgae/ # Earlier implementation (more manual)
│ ├── models.py # Model definitions
│ ├── layers.py # GCN layers from scratch
│ └── train.py # Training script
│
├── notebooks/ # Jupyter notebooks for experiments
├── scripts/ # Training and evaluation scripts
├── reports/ # Academic reports
│ ├── KanSethiFinalReport.pdf
│ ├── Kan_Sethi_Midway_Report.pdf
│ └── KanSethiProposal.pdf
└── figures/ # Generated visualizations
```
pip install torch torch-geometric networkx scikit-learn numpy
```

```python
import torch

from newvgae.vgae import VGAE
from newvgae.model import Encoder
from torch_geometric.datasets import Planetoid

# Load Cora citation network
dataset = Planetoid(root='/tmp/Cora', name='Cora')
data = dataset[0]

# Build model
encoder = Encoder(in_channels=dataset.num_features,
                  hidden_channels=32,
                  out_channels=64)  # 64-dim latent space
model = VGAE(encoder)

# Train
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(200):
    model.train()
    optimizer.zero_grad()
    z = model.encode(data.x, data.edge_index)
    loss = model.recon_loss(z, data.edge_index) + model.kl_loss()
    loss.backward()
    optimizer.step()

# Evaluate: pos_edges / neg_edges are the held-out edge splits
# produced during preprocessing (see newvgae/utils.py)
model.eval()
with torch.no_grad():
    z = model.encode(data.x, data.edge_index)
auc, ap = model.test(z, pos_edges, neg_edges)
print(f"AUC: {auc:.3f}, AP: {ap:.3f}")
```

- Kipf & Welling, "Variational Graph Auto-Encoders" (NIPS 2016 Workshop)
- Kipf & Welling, "Semi-Supervised Classification with Graph Convolutional Networks" (ICLR 2017)
- Final Report — Full methodology, experiments, and analysis
- Presentation Slides