Device
Runtime > Change runtime type > Hardware accelerator > set to GPU and save.
Setup
import torch
import os
print("PyTorch has version {}".format(torch.__version__))
PyTorch has version 1.13.1+cu116
# Install torch geometric
if 'IS_GRADESCOPE_ENV' not in os.environ:
!pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.13.1+cu116.html
!pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.13.1+cu116.html
!pip install torch-geometric
!pip install ogb
from torch_geometric.datasets import TUDataset
if 'IS_GRADESCOPE_ENV' not in os.environ:
root = './enzymes'
name = 'ENZYMES'
# The ENZYMES dataset
pyg_dataset = TUDataset(root, name)
# You will find that there are 600 graphs in this dataset
print(pyg_dataset)
Downloading https://www.chrsmrrs.com/graphkerneldatasets/ENZYMES.zip
Extracting enzymes/ENZYMES/ENZYMES.zip
Processing...
ENZYMES(600)
Done!
You can view basic statistics with the get_summary() method.
print(pyg_dataset.get_summary())
TUDataset (#graphs=600):
+------------+----------+----------+
| | #nodes | #edges |
|------------+----------+----------|
| mean | 32.6 | 124.3 |
| std | 15.3 | 51 |
| min | 2 | 2 |
| quantile25 | 22 | 86 |
| median | 32 | 120 |
| quantile75 | 41 | 164 |
| max | 126 | 298 |
+------------+----------+----------+
Question 1: What is the number of classes and number of features in the ENZYMES dataset?
def get_num_classes(pyg_dataset):
# TODO: Implement a function that takes a PyG dataset object
# and returns the number of classes for that dataset.
num_classes = 0
############# Your code here ############
## (~1 line of code)
## Note
## 1. Colab autocomplete functionality might be useful.
num_classes = pyg_dataset.num_classes
#########################################
return num_classes
def get_num_features(pyg_dataset):
# TODO: Implement a function that takes a PyG dataset object
# and returns the number of features for that dataset.
num_features = 0
############# Your code here ############
## (~1 line of code)
## Note
## 1. Colab autocomplete functionality might be useful.
num_features = pyg_dataset.num_features
#########################################
return num_features
if 'IS_GRADESCOPE_ENV' not in os.environ:
num_classes = get_num_classes(pyg_dataset)
num_features = get_num_features(pyg_dataset)
print("{} dataset has {} classes".format(name, num_classes))
print("{} dataset has {} features".format(name, num_features))
ENZYMES dataset has 6 classes
ENZYMES dataset has 3 features
※ A PyG dataset exposes several useful member attributes (a short snippet after this list shows a few of them):
- num_classes, num_features
- num_edge_attributes, num_edge_features, num_edge_labels
- num_node_attributes, num_node_features, num_node_labels
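A quick look at a few of these on ENZYMES; the first two values match the answers above, while the rest depend on the dataset:
print(pyg_dataset.num_classes)        # 6
print(pyg_dataset.num_features)       # 3
print(pyg_dataset.num_node_labels)    # node label dimensionality
print(pyg_dataset.num_edge_features)  # edge feature dimensionality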
Question 2: What is the label of the graph with index 100 in the ENZYMES dataset?
The torch_geometric documentation shows that a Data object takes these basic parameters (a toy example follows the list):
- x
- edge_index
- edge_attr
- y
- pos
- **kwargs
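As a toy illustration (not part of the assignment), here is a small undirected triangle graph built from these parameters:
import torch
from torch_geometric.data import Data

# Each undirected edge is stored in both directions
edge_index = torch.tensor([[0, 1, 1, 2, 2, 0],
                           [1, 0, 2, 1, 0, 2]], dtype=torch.long)
x = torch.randn(3, 3)  # 3 nodes with 3 features each, like ENZYMES
y = torch.tensor([0])  # a single graph-level label
toy = Data(x=x, edge_index=edge_index, y=y)
print(toy)            # Data(x=[3, 3], edge_index=[2, 6], y=[1])
print(toy.num_edges)  # 6 -- both directions are counted, which is why Question 3 divides by 2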
def get_graph_class(pyg_dataset, idx):
# TODO: Implement a function that takes a PyG dataset object,
# an index of a graph within the dataset, and returns the class/label
# of the graph (as an integer).
label = -1
############# Your code here ############
## (~1 line of code)
label = pyg_dataset[idx].y  # a tensor; label.item() would give a plain int
#########################################
return label
# Here pyg_dataset is a dataset for graph classification
if 'IS_GRADESCOPE_ENV' not in os.environ:
graph_0 = pyg_dataset[0]
print(graph_0)
idx = 100
label = get_graph_class(pyg_dataset, idx)
print('Graph with index {} has label {}'.format(idx, label))
Data(edge_index=[2, 168], x=[37, 3], y=[1])
Graph with index 100 has label tensor([4])
Question 3. How many edges does the graph with index 200 have?
If the graph is directed, calling num_edges as described above is enough.
If it is undirected, however, each edge is stored twice (once per direction), so the count must be divided by 2 to avoid counting an edge twice.
Use is_directed() to check which case applies before computing the edge count.
def get_graph_num_edges(pyg_dataset, idx):
# TODO: Implement a function that takes a PyG dataset object,
# the index of a graph in the dataset, and returns the number of
# edges in the graph (as an integer). You should not count an edge
# twice if the graph is undirected. For example, in an undirected
# graph G, if two nodes v and u are connected by an edge, this edge
# should only be counted once.
num_edges = 0
############# Your code here ############
## Note:
## 1. You can't return the data.num_edges directly
## 2. We assume the graph is undirected
## 3. Look at the PyG dataset built in functions
## (~4 lines of code)
if pyg_dataset[idx].is_directed():
num_edges = pyg_dataset[idx].num_edges
else:
num_edges = pyg_dataset[idx].num_edges // 2
#########################################
return num_edges
if 'IS_GRADESCOPE_ENV' not in os.environ:
idx = 200
num_edges = get_graph_num_edges(pyg_dataset, idx)
print('Graph with index {} has {} edges'.format(idx, num_edges))
Graph with index 200 has 53 edges
2) Open Graph Benchmark (OGB)
Dataset and Data
import torch_geometric.transforms as T
from ogb.nodeproppred import PygNodePropPredDataset
if 'IS_GRADESCOPE_ENV' not in os.environ:
dataset_name = 'ogbn-arxiv'
# Load the dataset and transform it to sparse tensor
dataset = PygNodePropPredDataset(name=dataset_name,
transform=T.ToSparseTensor())
print('The {} dataset has {} graph'.format(dataset_name, len(dataset)))
# Extract the graph
data = dataset[0]
print(data)
Downloading http://snap.stanford.edu/ogb/data/nodeproppred/arxiv.zip
Downloaded 0.08 GB: 100%|██████████| 81/81 [00:07<00:00, 10.64it/s]
Extracting dataset/arxiv.zip
Processing...
Loading necessary files...
This might take a while.
Processing graphs...
100%|██████████| 1/1 [00:00<00:00, 1778.75it/s]
Converting graphs into PyG objects...
100%|██████████| 1/1 [00:00<00:00, 4871.43it/s]
Saving...
Done!
The ogbn-arxiv dataset has 1 graph
Data(num_nodes=169343, x=[169343, 128], node_year=[169343, 1], y=[169343, 1], adj_t=[169343, 169343, nnz=1166243])
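The ToSparseTensor transform replaces edge_index with adj_t, a torch_sparse SparseTensor. A few things you can query on it (a sketch, assuming the torch_sparse API):
row, col, _ = data.adj_t.coo()    # COO indices of the stored edges
print(data.adj_t.sparse_sizes())  # (169343, 169343)
print(data.adj_t.nnz())           # 1166243, matching the printout above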
Question 4. How many features are in the ogbn-arxiv graph?
def graph_num_features(data):
# TODO: Implement a function that takes a PyG data object,
# and returns the number of features in the graph (as an integer).
num_features = 0
############# Your code here ############
## (~1 line of code)
num_features = data.num_features
#########################################
return num_features
if 'IS_GRADESCOPE_ENV' not in os.environ:
num_features = graph_num_features(data)
print('The graph has {} features'.format(num_features))
The graph has 128 features
3) GNN: Node Property Prediction
Setup
import torch
import pandas as pd
import torch.nn.functional as F
print(torch.__version__)
# The PyG built-in GCNConv
from torch_geometric.nn import GCNConv
import torch_geometric.transforms as T
from ogb.nodeproppred import PygNodePropPredDataset, Evaluator
Load and Preprocess the Dataset
if 'IS_GRADESCOPE_ENV' not in os.environ:
dataset_name = 'ogbn-arxiv'
dataset = PygNodePropPredDataset(name=dataset_name,
transform=T.ToSparseTensor())
data = dataset[0]
# Make the adjacency matrix symmetric
data.adj_t = data.adj_t.to_symmetric()
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# If you use GPU, the device should be cuda
print('Device: {}'.format(device))
data = data.to(device)
split_idx = dataset.get_idx_split()
train_idx = split_idx['train'].to(device)
Device: cuda
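split_idx is a dictionary of node-index tensors, one per split (ogbn-arxiv splits papers by publication year). A quick way to check it, as a sketch:
# Keys are 'train', 'valid', and 'test'
print({k: v.shape for k, v in split_idx.items()})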
GCN Model
Let's implement a GCN model that follows the structure in the figure below.
Per the instructions in the code comments:
- init
- Build self.convs and self.bns using torch.nn.ModuleList.
- Use log-softmax (torch.nn.LogSoftmax) as the final output layer.
- forward
- Wire the layers together as shown in the figure.
- Pass self.training to dropout as its training flag.
- If return_embeds is True, skip log-softmax and return the raw embeddings.
class GCN(torch.nn.Module):
def __init__(self, input_dim, hidden_dim, output_dim, num_layers,
dropout, return_embeds=False):
# TODO: Implement a function that initializes self.convs,
# self.bns, and self.softmax.
super(GCN, self).__init__()
# A list of GCNConv layers
self.convs = None
# A list of 1D batch normalization layers
self.bns = None
# The log softmax layer
self.softmax = None
############# Your code here ############
## Note:
## 1. You should use torch.nn.ModuleList for self.convs and self.bns
## 2. self.convs has num_layers GCNConv layers
## 3. self.bns has num_layers - 1 BatchNorm1d layers
## 4. You should use torch.nn.LogSoftmax for self.softmax
## 5. The parameters you can set for GCNConv include 'in_channels' and
## 'out_channels'. For more information please refer to the documentation:
## https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.GCNConv
## 6. The only parameter you need to set for BatchNorm1d is 'num_features'
## For more information please refer to the documentation:
## https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html
## (~10 lines of code)
self.convs = [GCNConv(input_dim, hidden_dim)]\
+ [GCNConv(hidden_dim, hidden_dim) for _ in range(num_layers - 2)]\
+ [GCNConv(hidden_dim, output_dim)]
self.convs = torch.nn.ModuleList(self.convs)
self.bns = torch.nn.ModuleList([torch.nn.BatchNorm1d(hidden_dim) for _ in range(num_layers - 1)])
self.softmax = torch.nn.LogSoftmax(dim=1)
#########################################
# Probability of an element getting zeroed
self.dropout = dropout
# Skip classification layer and return node embeddings
self.return_embeds = return_embeds
def reset_parameters(self):
for conv in self.convs:
conv.reset_parameters()
for bn in self.bns:
bn.reset_parameters()
def forward(self, x, adj_t):
# TODO: Implement a function that takes the feature tensor x and
# edge_index tensor adj_t and returns the output tensor as
# shown in the figure.
out = None
############# Your code here ############
## Note:
## 1. Construct the network as shown in the figure
## 2. torch.nn.functional.relu and torch.nn.functional.dropout are useful
## For more information please refer to the documentation:
## https://pytorch.org/docs/stable/nn.functional.html
## 3. Don't forget to set F.dropout training to self.training
## 4. If return_embeds is True, then skip the last softmax layer
## (~7 lines of code)
for conv, bn in zip(self.convs[:-1], self.bns):
x = conv(x, adj_t)
x = bn(x)
x = F.relu(x)
x = F.dropout(x, p=self.dropout, training=self.training)
x = self.convs[-1](x, adj_t)
out = x if self.return_embeds else self.softmax(x)
#########################################
return out
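A quick sanity check of the module (illustrative values: ogbn-arxiv has 128 input features and 40 classes):
toy_gcn = GCN(input_dim=128, hidden_dim=256, output_dim=40,
              num_layers=3, dropout=0.5)
print(toy_gcn)  # expect 3 GCNConv layers in convs and 2 BatchNorm1d layers in bns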
def train(model, data, train_idx, optimizer, loss_fn):
# TODO: Implement a function that trains the model by
# using the given optimizer and loss_fn.
model.train()
loss = 0
############# Your code here ############
## Note:
## 1. Zero grad the optimizer
## 2. Feed the data into the model
## 3. Slice the model output and label by train_idx
## 4. Feed the sliced output and label to loss_fn
## (~4 lines of code)
optimizer.zero_grad()
out = model(data.x, data.adj_t)
y_pred, y_true = out[train_idx], data.y[train_idx].squeeze()
loss = loss_fn(y_pred, y_true)
#########################################
loss.backward()
optimizer.step()
return loss.item()
# Test function here
@torch.no_grad()
def test(model, data, split_idx, evaluator, save_model_results=False):
# TODO: Implement a function that tests the model by
# using the given split_idx and evaluator.
model.eval()
# The output of model on all data
out = None
############# Your code here ############
## (~1 line of code)
## Note:
## 1. No index slicing here
out = model(data.x, data.adj_t)
#########################################
y_pred = out.argmax(dim=-1, keepdim=True)
train_acc = evaluator.eval({
'y_true': data.y[split_idx['train']],
'y_pred': y_pred[split_idx['train']],
})['acc']
valid_acc = evaluator.eval({
'y_true': data.y[split_idx['valid']],
'y_pred': y_pred[split_idx['valid']],
})['acc']
test_acc = evaluator.eval({
'y_true': data.y[split_idx['test']],
'y_pred': y_pred[split_idx['test']],
})['acc']
if save_model_results:
print ("Saving Model Predictions")
data = {}  # note: this shadows the `data` argument, which is no longer needed here
data['y_pred'] = y_pred.view(-1).cpu().detach().numpy()
df = pd.DataFrame(data=data)
# Save locally as csv
df.to_csv('ogbn-arxiv_node.csv', sep=',', index=False)
return train_acc, valid_acc, test_acc
# Please do not change the args
if 'IS_GRADESCOPE_ENV' not in os.environ:
args = {
'device': device,
'num_layers': 3,
'hidden_dim': 256,
'dropout': 0.5,
'lr': 0.01,
'epochs': 100,
}
args
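The cell that instantiates the model and evaluator is not shown in this excerpt; here is a sketch consistent with the training loop below (the original cell may differ slightly):
model = GCN(data.num_features, args['hidden_dim'],
            dataset.num_classes, args['num_layers'],
            args['dropout']).to(device)
evaluator = Evaluator(name='ogbn-arxiv')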
# Please do not change these args
# Training should take <10min using GPU runtime
import copy
if 'IS_GRADESCOPE_ENV' not in os.environ:
# reset the parameters to initial random value
model.reset_parameters()
optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
loss_fn = F.nll_loss
best_model = None
best_valid_acc = 0
for epoch in range(1, 1 + args["epochs"]):
loss = train(model, data, train_idx, optimizer, loss_fn)
result = test(model, data, split_idx, evaluator)
train_acc, valid_acc, test_acc = result
if valid_acc > best_valid_acc:
best_valid_acc = valid_acc
best_model = copy.deepcopy(model)
print(f'Epoch: {epoch:02d}, '
f'Loss: {loss:.4f}, '
f'Train: {100 * train_acc:.2f}%, '
f'Valid: {100 * valid_acc:.2f}% '
f'Test: {100 * test_acc:.2f}%')
Epoch: 01, Loss: 4.0002, Train: 24.64%, Valid: 28.32% Test: 25.51%
Epoch: 02, Loss: 2.3358, Train: 27.88%, Valid: 25.51% Test: 30.40%
Epoch: 03, Loss: 1.9505, Train: 31.58%, Valid: 31.26% Test: 34.20%
.
.
.
Epoch: 98, Loss: 0.9150, Train: 73.72%, Valid: 71.79% Test: 71.10%
Epoch: 99, Loss: 0.9133, Train: 73.78%, Valid: 71.51% Test: 70.63%
Epoch: 100, Loss: 0.9114, Train: 73.67%, Valid: 70.93% Test: 69.87%
Question 5: What are your best_model validation and test accuracies?
if 'IS_GRADESCOPE_ENV' not in os.environ:
best_result = test(best_model, data, split_idx, evaluator, save_model_results=True)
train_acc, valid_acc, test_acc = best_result
print(f'Best model: '
f'Train: {100 * train_acc:.2f}%, '
f'Valid: {100 * valid_acc:.2f}% '
f'Test: {100 * test_acc:.2f}%')
Saving Model Predictions
Best model: Train: 72.51%, Valid: 72.36% Test: 71.94%
4) GNN: Graph Property Prediction
Load and preprocess the dataset
from ogb.graphproppred import PygGraphPropPredDataset, Evaluator
from torch_geometric.data import DataLoader  # in newer PyG versions: torch_geometric.loader.DataLoader
from tqdm.notebook import tqdm
if 'IS_GRADESCOPE_ENV' not in os.environ:
# Load the dataset
dataset = PygGraphPropPredDataset(name='ogbg-molhiv')
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Device: {}'.format(device))
split_idx = dataset.get_idx_split()
# Check task type
print('Task type: {}'.format(dataset.task_type))
Device: cuda
Task type: binary classification
# Load the dataset splits into corresponding dataloaders
# We will train the graph classification task on a batch of 32 graphs
# Shuffle the order of graphs for training set
if 'IS_GRADESCOPE_ENV' not in os.environ:
train_loader = DataLoader(dataset[split_idx["train"]], batch_size=32, shuffle=True, num_workers=0)
valid_loader = DataLoader(dataset[split_idx["valid"]], batch_size=32, shuffle=False, num_workers=0)
test_loader = DataLoader(dataset[split_idx["test"]], batch_size=32, shuffle=False, num_workers=0)
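Peeking at one mini-batch shows how PyG stacks graphs (a sketch; not output from the original post):
batch = next(iter(train_loader))
print(batch.num_graphs)   # 32
print(batch.batch.shape)  # one graph id per node across the whole mini-batch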
if 'IS_GRADESCOPE_ENV' not in os.environ:
# Please do not change the args
args = {
'device': device,
'num_layers': 5,
'hidden_dim': 256,
'dropout': 0.5,
'lr': 0.001,
'epochs': 30,
}
args
from ogb.graphproppred.mol_encoder import AtomEncoder
from torch_geometric.nn import global_add_pool, global_mean_pool
### GCN to predict graph property
class GCN_Graph(torch.nn.Module):
def __init__(self, hidden_dim, output_dim, num_layers, dropout):
super(GCN_Graph, self).__init__()
# Load encoders for Atoms in molecule graphs
self.node_encoder = AtomEncoder(hidden_dim)
# Node embedding model
# Note that the input_dim and output_dim are set to hidden_dim
self.gnn_node = GCN(hidden_dim, hidden_dim,
hidden_dim, num_layers, dropout, return_embeds=True)
self.pool = None
############# Your code here ############
## Note:
## 1. Initialize self.pool as a global mean pooling layer
## For more information please refer to the documentation:
## https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#global-pooling-layers
self.pool = global_mean_pool
#########################################
# Output layer
self.linear = torch.nn.Linear(hidden_dim, output_dim)
def reset_parameters(self):
self.gnn_node.reset_parameters()
self.linear.reset_parameters()
def forward(self, batched_data):
# TODO: Implement a function that takes as input a
# mini-batch of graphs (torch_geometric.data.Batch) and
# returns the predicted graph property for each graph.
#
# NOTE: Since we are predicting graph level properties,
# your output will be a tensor with dimension equaling
# the number of graphs in the mini-batch
# Extract important attributes of our mini-batch
x, edge_index, batch = batched_data.x, batched_data.edge_index, batched_data.batch
embed = self.node_encoder(x)
out = None
############# Your code here ############
## Note:
## 1. Construct node embeddings using existing GCN model
## 2. Use the global pooling layer to aggregate features for each individual graph
## For more information please refer to the documentation:
## https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#global-pooling-layers
## 3. Use a linear layer to predict each graph's property
## (~3 lines of code)
out = self.gnn_node(embed, edge_index)
out = self.pool(out, batch)
out = self.linear(out)
#########################################
return out
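To see what the batch vector does, here is a toy check of global_mean_pool (assumed shapes, not assignment code):
import torch
from torch_geometric.nn import global_mean_pool

x = torch.randn(7, 4)                        # 7 node embeddings
batch = torch.tensor([0, 0, 1, 1, 1, 2, 2])  # 3 graphs of sizes 2, 3, 2
print(global_mean_pool(x, batch).shape)      # torch.Size([3, 4])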
Train
- Zero out the optimizer's gradients
- optimizer.zero_grad()
- Feed the data into the model
- out = model(batch)
- Use the is_labeled mask to select the labeled outputs and labels
- out = out[is_labeled]
- label = batch.y[is_labeled]
- The labels need to be cast to torch.float32
- label = batch.y[is_labeled].type(torch.float32).to(device)
- Feed the output and label to the loss function loss_fn
def train(model, device, data_loader, optimizer, loss_fn):
# TODO: Implement a function that trains your model by
# using the given optimizer and loss_fn.
model.train()
loss = 0
for step, batch in enumerate(tqdm(data_loader, desc="Iteration")):
batch = batch.to(device)
# Skip degenerate batches (a single node, or only one graph in the batch)
if batch.x.shape[0] == 1 or batch.batch[-1] == 0:
pass
else:
## ignore nan targets (unlabeled) when computing training loss.
is_labeled = batch.y == batch.y
############# Your code here ############
## Note:
## 1. Zero grad the optimizer
## 2. Feed the data into the model
## 3. Use `is_labeled` mask to filter output and labels
## 4. You may need to change the type of label to torch.float32
## 5. Feed the output and label to the loss_fn
## (~3 lines of code)
optimizer.zero_grad()
out = model(batch)
loss = loss_fn(out[is_labeled], batch.y[is_labeled].type(torch.float32).to(device))
#########################################
loss.backward()
optimizer.step()
return loss.item()
# The evaluation function
def eval(model, device, loader, evaluator, save_model_results=False, save_file=None):
model.eval()
y_true = []
y_pred = []
for step, batch in enumerate(tqdm(loader, desc="Iteration")):
batch = batch.to(device)
# Skip batches consisting of a single node
if batch.x.shape[0] == 1:
pass
else:
with torch.no_grad():
pred = model(batch)
y_true.append(batch.y.view(pred.shape).detach().cpu())
y_pred.append(pred.detach().cpu())
y_true = torch.cat(y_true, dim = 0).numpy()
y_pred = torch.cat(y_pred, dim = 0).numpy()
input_dict = {"y_true": y_true, "y_pred": y_pred}
if save_model_results:
print ("Saving Model Predictions")
# Create a pandas dataframe with a two columns
# y_pred | y_true
data = {}
data['y_pred'] = y_pred.reshape(-1)
data['y_true'] = y_true.reshape(-1)
df = pd.DataFrame(data=data)
# Save to csv
df.to_csv('ogbg-molhiv_graph_' + save_file + '.csv', sep=',', index=False)
return evaluator.eval(input_dict)
if 'IS_GRADESCOPE_ENV' not in os.environ:
model = GCN_Graph(args['hidden_dim'],
dataset.num_tasks, args['num_layers'],
args['dropout']).to(device)
evaluator = Evaluator(name='ogbg-molhiv')
# Please do not change these args
# Training should take <10min using GPU runtime
import copy
if 'IS_GRADESCOPE_ENV' not in os.environ:
model.reset_parameters()
optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
loss_fn = torch.nn.BCEWithLogitsLoss()
best_model = None
best_valid_acc = 0
for epoch in range(1, 1 + args["epochs"]):
print('Training...')
loss = train(model, device, train_loader, optimizer, loss_fn)
print('Evaluating...')
train_result = eval(model, device, train_loader, evaluator)
val_result = eval(model, device, valid_loader, evaluator)
test_result = eval(model, device, test_loader, evaluator)
train_acc, valid_acc, test_acc = train_result[dataset.eval_metric], val_result[dataset.eval_metric], test_result[dataset.eval_metric]
if valid_acc > best_valid_acc:
best_valid_acc = valid_acc
best_model = copy.deepcopy(model)
print(f'Epoch: {epoch:02d}, '
f'Loss: {loss:.4f}, '
f'Train: {100 * train_acc:.2f}%, '
f'Valid: {100 * valid_acc:.2f}% '
f'Test: {100 * test_acc:.2f}%')
Question 6: What are your best model validation and test ROC-AUC scores?
if 'IS_GRADESCOPE_ENV' not in os.environ:
train_acc = eval(best_model, device, train_loader, evaluator)[dataset.eval_metric]
valid_acc = eval(best_model, device, valid_loader, evaluator, save_model_results=True, save_file="valid")[dataset.eval_metric]
test_acc = eval(best_model, device, test_loader, evaluator, save_model_results=True, save_file="test")[dataset.eval_metric]
print(f'Best model: '
f'Train: {100 * train_acc:.2f}%, '
f'Valid: {100 * valid_acc:.2f}% '
f'Test: {100 * test_acc:.2f}%')
Question 7 (Optional): Experiment with the two other global pooling layers in Pytorch Geometric.
Swap self.pool to global_add_pool or global_max_pool.
It would be more convenient if pool were an __init__ parameter of the GCN_Graph class... (a sketch of that refactor follows).
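One way to do that, as a sketch (GCN_Graph_Pool is a hypothetical subclass, not part of the assignment):
from torch_geometric.nn import global_add_pool, global_max_pool

class GCN_Graph_Pool(GCN_Graph):
    def __init__(self, hidden_dim, output_dim, num_layers, dropout,
                 pool=global_add_pool):
        super().__init__(hidden_dim, output_dim, num_layers, dropout)
        self.pool = pool  # e.g. global_add_pool or global_max_pool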