How To Import The MNIST Dataset From Local Directory Using PyTorch

Written by Aionlinecourse

MNIST is a dataset of handwritten digits that is widely used as a benchmark for digit recognition. It contains 60,000 labeled training examples and 10,000 test examples covering the 10 digits 0 through 9, each stored as a 28x28 grayscale image. Datasets like MNIST are useful because they offer a simple framework for running classic deep learning experiments.

How To Import The MNIST Dataset From Local Directory Using PyTorch:

Method 1:

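You can import the data using this format:

import torchvision

# root is the folder that holds (or will receive) the MNIST files
xy_trainPT = torchvision.datasets.MNIST(
    root="~/Handwritten_Deep_L/",
    train=True,       # load the training split
    download=True,    # download only if the files are not already under root
    transform=torchvision.transforms.Compose([torchvision.transforms.ToTensor()]),
)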

With download=True, torchvision first checks whether the root directory (the path you gave) already contains the dataset.

If it does not, the dataset is downloaded from the web.

If it does, the existing files are used and nothing is downloaded from the internet.

You can verify this yourself: first pass a path that holds no dataset (the data will be downloaded), then pass a path that already contains the dataset (no download takes place).
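To rely only on a local copy, one option is to confirm that the files are present and then pass download=False, so nothing is fetched. The sketch below is a minimal illustration, assuming a recent torchvision release that looks for the extracted files in an MNIST/raw subfolder of root. The helper names missing_mnist_files and load_local_mnist are hypothetical and not part of torchvision.

```python
from pathlib import Path

# Names of the four extracted MNIST files (assumed layout: <root>/MNIST/raw/)
MNIST_FILES = (
    "train-images-idx3-ubyte",
    "train-labels-idx1-ubyte",
    "t10k-images-idx3-ubyte",
    "t10k-labels-idx1-ubyte",
)


def missing_mnist_files(root):
    """Return the MNIST file names not found under <root>/MNIST/raw."""
    raw_dir = Path(root).expanduser() / "MNIST" / "raw"
    return [name for name in MNIST_FILES if not (raw_dir / name).exists()]


def load_local_mnist(root, train=True):
    """Load MNIST strictly from disk; raise if any file is missing."""
    missing = missing_mnist_files(root)
    if missing:
        raise FileNotFoundError(f"Missing under {root}/MNIST/raw: {missing}")
    # Imported here so the file check above works without torchvision installed
    import torchvision

    return torchvision.datasets.MNIST(
        root=str(Path(root).expanduser()),
        train=train,
        download=False,  # never touch the network
        transform=torchvision.transforms.ToTensor(),
    )
```

Calling load_local_mnist("~/Handwritten_Deep_L/") would then either return the dataset or report which files are missing, instead of silently starting a download.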


Method 2:

The MNIST dataset is not stored as images, but in a binary format (as indicated by the ubyte extension). Therefore, ImageFolder is not the type of dataset you want. Instead, you will need to use the MNIST dataset class, which can also download the data if it is not already on disk.
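To make the binary layout concrete, the header of such a file can be inspected with the standard library alone. This is a rough sketch that assumes the IDX layout used by the MNIST files: two zero bytes, a one-byte type code (0x08 for unsigned byte), a one-byte dimension count, then one big-endian 32-bit size per dimension. The function read_idx_header is a hypothetical helper written for this example.

```python
import struct


def read_idx_header(data):
    """Parse an IDX header; return (type_code, shape, offset_of_first_value)."""
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file: the first two bytes must be zero")
    # Each dimension size is a big-endian unsigned 32-bit integer
    shape = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    return type_code, shape, 4 + 4 * ndim


# A hand-built header shaped like the MNIST training images file
header = struct.pack(">HBBIII", 0, 0x08, 3, 60000, 28, 28)
print(read_idx_header(header))  # (8, (60000, 28, 28), 16)
```

On a real file you would pass the first bytes read with open(path, "rb"). The MNIST dataset class does this parsing for you, which is why it is the better fit here than ImageFolder.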

This is a dataset class, so just instantiate it with the proper root path, then pass it to your DataLoader, and everything should work just fine.

If you want to check the images, index the dataset (or take a batch from the DataLoader) and save the result as a PNG file (you may need to convert the tensor to a NumPy array or a PIL image first).


Thank you for reading the article.