A trainer based on PyTorch

Project description

Torch Terinador

A trainer based on PyTorch that includes a training loop for MDNs (Mixture Density Networks), a data loader, line-chart plotting, and a couple of techniques for avoiding overfitting.

Installation

This package requires Python>=3.7. It was developed with PyTorch 1.13.1 and CUDA 11.2. Because CUDA versions differ between machines, the package does not install PyTorch automatically. Check your CUDA version, install a matching PyTorch build first, and then run the command below:

pip install tortreinador
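
Because the PyTorch install is left to you, it can help to check which build is present before installing. The sketch below relies only on standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`); the helper name `torch_build_info` is hypothetical and is not part of tortreinador.

```python
import importlib.util


def torch_build_info():
    """Return version details of the installed PyTorch, or None if it is missing."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed yet

    import torch

    return {
        "torch": torch.__version__,                   # version string of the installed build
        "cuda_build": torch.version.cuda,             # CUDA version of the build, None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),  # True only if a usable GPU and driver are found
    }


print(torch_build_info())
```

If the function returns `None`, install PyTorch first; if `cuda_available` is `False`, training on a GPU will not work until the driver and build match.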

Quick Start

from tortreinador import train
from tortreinador.models.MDN import mdn, Mixture, NLLLoss
import torch
import pandas as pd

data = pd.read_excel('D:\\Resource\\Gas_Giants_Core_Earth20W.xlsx')
data['M_total (M_E)'] = data['Mcore (M_J/10^3)'] + data['Menv (M_E)']

# Column indices are also supported, e.g. input_parameters = [0, 1, 2]
input_parameters = [
    'Mass (M_J)',
    'Radius (R_E)',
    'T_sur (K)',
]

output_parameters = [
    'M_total (M_E)',
    'T_int (K)',
    'P_CEB (Mbar)',
    'T_CEB (K)'
]

trainer = train.TorchTrainer()

# Load Data
t_loader, v_loader, test_x, test_y, s_x, s_y = trainer.load_data(data=data, input_parameters=input_parameters,
                                                                 output_parameters=output_parameters,
                                                                 if_normal=True, if_shuffle=True)

# Model
model = mdn(len(input_parameters), len(output_parameters), 10, 256)

# Loss
criterion = NLLLoss()
pdf = Mixture()

# Optimizer
optim = torch.optim.Adam(trainer.xavier_init(model), lr=0.0001984, weight_decay=0.001)

# Training
t_l, v_l, val_r2, train_r2, mse = trainer.fit_for_MDN(t_loader, v_loader, criterion, model=model, mixture=pdf,
                                                      model_save_path='D:\\Resource\\MDN\\', optim=optim, best_r2=0.5)
# Plot line chart
result_pd = pd.DataFrame()
result_pd['epoch'] = range(150)  # 150 matches the default epoch count of TorchTrainer
result_pd['train_r2_avg'] = train_r2
result_pd['val_r2_avg'] = val_r2

trainer.plot_line_2(y_1='train_r2_avg', y_2='val_r2_avg', df=result_pd, fig_size=(10, 6), output_path="your save path", dpi=300)
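
The R2 curves plotted above are coefficient-of-determination scores. How tortreinador computes them internally is not shown in this description, so the following is only a plain-Python sketch of the standard single-output definition, R2 = 1 - SS_res / SS_tot. The function name `r2_score_1d` is hypothetical.

```python
def r2_score_1d(y_true, y_pred):
    """Coefficient of determination for one output variable."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


perfect = r2_score_1d([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # predictions equal targets
baseline = r2_score_1d([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # always predicts the mean
```

A perfect fit scores 1.0 and a model that always predicts the mean scores 0.0, which gives a scale for reading the two curves.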

Functions

This package currently supports only MDNs, but load_data() works for any task as long as the data is a pandas DataFrame.
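
As a rough illustration of what a loader of this kind does, the sketch below splits row indices by fraction and min-max scales a column, using only the standard library. It is not the package's implementation: the helper names `split_indices` and `min_max_scale` are hypothetical, the (0, 1) target range is an assumption, and how load_data() actually fits its scalers is not stated here.

```python
import random


def split_indices(n_rows, train_size=0.8, val_size=0.1, test_size=0.1, shuffle=True, seed=0):
    """Split row indices into train/validation/test parts by fraction."""
    if abs(train_size + val_size + test_size - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")
    indices = list(range(n_rows))
    if shuffle:
        random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = int(n_rows * train_size)
    n_val = int(n_rows * val_size)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


def min_max_scale(column, feature_range=(0.0, 1.0)):
    """Rescale one column of numbers linearly into feature_range."""
    low, high = feature_range
    c_min, c_max = min(column), max(column)
    span = (c_max - c_min) or 1.0  # avoid division by zero for constant columns
    return [low + (v - c_min) / span * (high - low) for v in column]


train_idx, val_idx, test_idx = split_indices(1000)
scaled = min_max_scale([2.0, 4.0, 6.0])
```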

tortreinador.train.TorchTrainer():
- Parameters:
  - batch_size: int = 512
  - is_gpu: bool = True
  - epoch: int = 150
  - log_dir: Optional[str] = None. Specify a path to enable logging with tensorboardX
- Functions:
  - load_data()
    - Describe: Processes a DataFrame according to the input/output parameters and the split sizes, producing a training set, a validation set, and a test set. Normalization and shuffling are optional.
    - Parameters:
      - data: DataFrame
      - input_parameters: list
      - output_parameters: list
      - feature_range: Any = None
      - train_size: float = 0.8
      - val_size: float = 0.1
      - test_size: float = 0.1
      - if_normal: bool = True
      - if_shuffle: bool = True
      - n_workers: int = 8
    - Return:
      - DataLoader (training set)
      - DataLoader (validation set)
      - NumPy array (test inputs)
      - NumPy array (test outputs)
      - MinMaxScaler (for the inputs)
      - MinMaxScaler (for the outputs)
  - plot_line_2()
    - Describe: Typically used after training to compare two curves, such as validation loss against training loss or validation R2 against training R2.
    - Parameters:
      - y_1: str
      - y_2: str
      - df: DataFrame
      - output_path: str
      - fig_size: tuple = (10, 6)
      - dpi: int = 300
  - xavier_init()
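    - Describe: Xavier weight initialization, offered as a technique for preventing overfitting. In the Quick Start, its return value is passed to the optimizer as the parameters to optimize.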
    - Parameters:
      - net: Module
  - _calculate()
    - Describe: A private method that calculates the loss. It can be overridden (this feature is still being tested).
    - Parameters:
      - model: Any
      - pdf: Any
      - x: Any
      - y: Any
      - criterion: Any
      - t: str = 'train'
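```
def _calculate(self, model, pdf, x, y, criterion, t='train'):
    # Forward pass: the model returns the mixture parameters pi, mu and sigma
    pi, mu, sigma = model(x)

    # Build the mixture from those parameters and sample a prediction
    mixture = pdf(pi, mu, sigma)
    y_pred = mixture.sample()

    if t == 'train':
        # Training: loss, predictions, targets
        return criterion(pi, mu, sigma, y), y_pred.cpu().numpy(), y.cpu().numpy()

    else:
        # Otherwise: loss, MSE, predictions, targets
        return criterion(pi, mu, sigma, y), self.mse(y_pred, y), y_pred.cpu().numpy(), y.cpu().numpy()
```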
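
The `criterion(pi, mu, sigma, y)` and `mixture.sample()` calls above can be illustrated with a one-dimensional Gaussian mixture in plain Python. This is a sketch under assumptions, not the package's `NLLLoss` or `Mixture` classes, whose internals are not shown here: it assumes `pi` holds positive weights that sum to 1 and `sigma` holds standard deviations, and the function names `mixture_nll` and `mixture_sample` are hypothetical.

```python
import math
import random


def mixture_nll(pi, mu, sigma, y):
    """Negative log-likelihood of scalar y under a 1-D Gaussian mixture."""
    log_terms = []
    for weight, mean, std in zip(pi, mu, sigma):
        # Log density of N(mean, std^2) at y
        log_normal = -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((y - mean) / std) ** 2
        log_terms.append(math.log(weight) + log_normal)
    # log-sum-exp keeps the sum stable when individual terms are very small
    peak = max(log_terms)
    log_likelihood = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return -log_likelihood


def mixture_sample(pi, mu, sigma, rng):
    """Draw one value: pick a component by weight, then sample its Gaussian."""
    k = rng.choices(range(len(pi)), weights=pi, k=1)[0]
    return rng.gauss(mu[k], sigma[k])


pi, mu, sigma = [0.3, 0.7], [0.0, 5.0], [1.0, 0.5]
loss_near_mode = mixture_nll(pi, mu, sigma, y=5.0)   # target sits on a component mean
loss_far_away = mixture_nll(pi, mu, sigma, y=20.0)   # target far from both components
draw = mixture_sample(pi, mu, sigma, random.Random(0))
```

The loss is small when the target lies where the mixture puts probability mass and grows quickly as it moves away, which is what training minimizes.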
  - fit_for_MDN()
    - Describe: Training loop for an MDN (Mixture Density Network)
    - Parameters:
      - t_l: DataLoader
      - v_l: DataLoader
      - criterion: Module
      - optim: Optimizer
      - model: Module
      - model_save_path: str. The best-performing model is saved to this path
      - mixture: Module. The sampling class, which inherits from nn.Module
      - warmup_epoch: Optional[int] = None. A technique for preventing overfitting: specify a number such as 5 to use warm-up during the first 5 epochs
      - lr_milestones: Optional[list] = None. Decreases the learning rate according to this list and gamma. For example, with lr_milestones=[10] and gamma=0.7, the learning rate is multiplied by 0.7 at epoch 10
      - gamma: float = 0.7
      - best_r2: float = 0.80
    - Return:
      - train loss: list
      - validation loss: list
      - validation R2: list
      - train R2: list
      - validation mse: list
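
The interaction of `warmup_epoch`, `lr_milestones` and `gamma` described above can be sketched as a small schedule function. This is an illustration under assumptions, not the package's code: the warm-up is assumed to be linear, epochs are assumed to be counted from 0, and the name `lr_at_epoch` is hypothetical. The milestone part follows the same rule as PyTorch's `MultiStepLR` scheduler, which multiplies the rate by `gamma` once each milestone is reached.

```python
def lr_at_epoch(epoch, base_lr, warmup_epoch=None, lr_milestones=None, gamma=0.7):
    """Learning rate for a 0-based epoch under warm-up plus milestone decay."""
    if warmup_epoch and epoch < warmup_epoch:
        # Assumed linear warm-up: ramp from base_lr / warmup_epoch up to base_lr
        return base_lr * (epoch + 1) / warmup_epoch
    # Count how many milestones have been reached and decay once per milestone
    passed = sum(1 for m in (lr_milestones or []) if epoch >= m)
    return base_lr * gamma ** passed


# The example from the parameter list: lr_milestones=[10], gamma=0.7
schedule = [lr_at_epoch(e, 0.001, lr_milestones=[10], gamma=0.7) for e in range(12)]
warmup = [lr_at_epoch(e, 0.001, warmup_epoch=5) for e in range(6)]
```

In `schedule`, the rate stays at 0.001 through epoch 9 and drops to about 0.0007 from epoch 10 on; in `warmup`, it climbs in five equal steps to 0.001.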

Project details

Development Status
- 3 - Alpha
Environment
- GPU :: NVIDIA CUDA :: 11.2
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history

- 0.0.6 (this version): Jan 16, 2024
- 0.0.5: Dec 9, 2023
- 0.0.4: Dec 9, 2023
- 0.0.3: Dec 9, 2023
- 0.0.2: Dec 9, 2023

Download files

Source Distribution

tortreinador-0.0.6.tar.gz (13.2 kB)

Uploaded Jan 16, 2024 Source

Built Distribution

tortreinador-0.0.6-py3-none-any.whl (12.7 kB)

Uploaded Jan 16, 2024 Python 3

Hashes for tortreinador-0.0.6.tar.gz
Algorithm	Hash digest
SHA256	`4db310ead64312130886d4df9863469e8b81723ca1917d7a6cb3bf2378355856`
MD5	`2b6a54606c29a1304ef62bc997102927`
BLAKE2b-256	`1ca92a58cdfef2cd728cbdbae53d09e87c571af0183bf536e38cb58a69236398`

Hashes for tortreinador-0.0.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f4f7761608ad154879a7f4f56ff9621ac0af61caa6ef100e2c15fbfc3d7ba777`
MD5	`1b92f01b74b97d9ecdcf9f6d23d7bc83`
BLAKE2b-256	`8ccc1736fb6deeb4f7f06135736f659556f30e177ac0ea77db42733cd7228afc`