Is a scripted PyTorch model faster?
In a machine learning pipeline, understanding how to script a PyTorch model is essential. As explained in an excellent introductory post, a few advantages of scripting include:
- Saving and transferring the model to environments outside of Python
- Obtaining an intermediate representation that can be further optimized
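Both advantages can be seen in a few lines. The following is a minimal sketch, not taken from the referenced post: `TinyNet` and the file name `tiny_net.pt` are hypothetical stand-ins chosen so the example runs quickly without downloading weights. It scripts a module, prints the intermediate representation, and round-trips the model through a file.

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a real model such as ResNet-18."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.fc(x))


model = TinyNet().eval()
scripted = torch.jit.script(model)

# The intermediate representation that optimization passes operate on.
print(scripted.graph)

# Save to a single file and load it back; the file name is an assumption.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny_net.pt")
    torch.jit.save(scripted, path)
    restored = torch.jit.load(path)

# The restored module should reproduce the original outputs.
x = torch.rand(3, 4)
with torch.no_grad():
    same = torch.allclose(model(x), restored(x))
print("outputs match:", same)
```

The saved file does not depend on the Python class definition, which is what allows it to be loaded elsewhere, for example from C++ with `torch::jit::load`.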
Recently, I explored AWS SageMaker, focusing specifically on deploying a custom model within the service. Through the official documentation, I discovered that the TorchScript format is also a standard format for saving trained models.
In this post, I would rather focus on the performance of scripted models, particularly in light of the performance benefits claimed here. To investigate, I extended the original script, varying the batch size and increasing the number of repetitions.
import torchvision
import torch
from time import perf_counter
import numpy as np
from torchvision.models import ResNet18_Weights


def timer(f, *args):
    # Wall-clock time of a single call, in milliseconds.
    start = perf_counter()
    f(*args)
    return 1000 * (perf_counter() - start)


def get_model(device='cpu', scripted=False):
    model = torchvision.models.resnet18(weights=ResNet18_Weights.DEFAULT).to(device)
    model.eval()
    if scripted:
        with torch.jit.optimized_execution(True):
            # torch.jit.script compiles from source and needs no example input;
            # its second positional parameter is the deprecated `optimize` flag.
            model = torch.jit.script(model)
    return model


def get_tensor(device='cpu', bs=1):
    # Random batch with the standard ImageNet input shape.
    return torch.rand(bs, 3, 224, 224).to(device)


for scripted_mode in [False, True]:
    for device in ['cpu', 'cuda']:
        for bs in [1, 32, 128]:
            a = get_tensor(device, bs)
            model = get_model(device, scripted_mode)
            # Mean latency over 100 forward passes.
            res = np.mean([timer(model, a) for _ in range(100)])
            print(
                f"Scripted: {scripted_mode}, Device: {device}, BS: {bs} Time: {res:.3f}"
            )
And here are the collected results:
Based on my results, I did not find the general performance benefits claimed in the post, on either CPU or GPU. Furthermore, there’s an ongoing discussion about this in the PyTorch forums. In summary, scripted and non-scripted models performed comparably. The scripted version was faster at a batch size of 1, but the trend reversed as the batch size increased.
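One caveat about the timing loop above: CUDA kernels are launched asynchronously, so stopping the clock right after the call may capture only part of the GPU work, and the first calls to a scripted model can include one-off optimization overhead. A possible refinement, which was not part of the experiment reported here, is to discard a few warm-up calls and synchronize before reading the clock. The sketch below is a hypothetical helper (`benchmark`, `warmup`, `repeats`, and `sync` are names introduced for illustration) and uses a plain Python workload so that it runs without a GPU.

```python
from statistics import mean
from time import perf_counter


def benchmark(fn, *args, warmup=10, repeats=100, sync=None):
    """Return the mean wall-clock time of fn(*args) in milliseconds.

    sync is an optional zero-argument callable that blocks until pending
    work has finished, e.g. torch.cuda.synchronize for CUDA models.
    """
    for _ in range(warmup):
        fn(*args)  # warm-up calls are not timed
    if sync is not None:
        sync()

    timings = []
    for _ in range(repeats):
        start = perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # wait for asynchronous work before stopping the clock
        timings.append(1000 * (perf_counter() - start))
    return mean(timings)


# Stand-in workload so the sketch runs without PyTorch or a GPU.
calls = []
elapsed_ms = benchmark(
    lambda n: calls.append(sum(range(n))), 10_000, warmup=2, repeats=5
)
print(f"mean: {elapsed_ms:.3f} ms over 5 timed calls")
```

With the script above, the call would look like `benchmark(model, a, sync=torch.cuda.synchronize)` on a CUDA device and `benchmark(model, a)` on CPU. Whether this changes the comparison between scripted and non-scripted models is something I have not measured.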