Hi,
In this post, I am sharing how to load data sets that are stored in Azure blob storage into a Pandas data frame.
I have the full code posted in Azure Notebooks. This snippet is handy in any Jupyter notebook when you are working on your data pipeline or developing Machine Learning models.
I have exported a data set into a CSV file and stored it in Azure blob storage so I can use it in my notebooks.
Python code snippet:
import pandas as pd
# import azure sdk packages
from azure.storage.blob import BlobService

def readBlobIntoDF(storageAccountName, storageAccountKey, containerName, blobName, localFileName):
    # get an instance of the blob service for the storage account
    blob_service = BlobService(account_name=storageAccountName, account_key=storageAccountKey)
    # download the blob content into a local file
    blob_service.get_blob_to_path(containerName, blobName, localFileName)
    # load the local csv file into a pandas dataframe
    dataframe_blobdata = pd.read_csv(localFileName, header=0)
    return dataframe_blobdata
STORAGEACCOUNTNAME = 'STORAGE_ACCOUNT_NAME'
STORAGEACCOUNTKEY = 'STORAGE_KEY'
CONTAINERNAME = 'CONTAINER_NAME'
BLOBNAME = 'BLOB_NAME.csv'
LOCALFILENAME = 'FILE_NAME-csv-local'

# load blob file into a pandas dataframe
tmp = readBlobIntoDF(STORAGEACCOUNTNAME, STORAGEACCOUNTKEY, CONTAINERNAME, BLOBNAME, LOCALFILENAME)
tmp.head()
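If you are on the newer azure-storage-blob SDK (version 12 and later), BlobService no longer exists. Below is a minimal sketch of the same idea using BlobServiceClient, reading the blob straight into memory so no local file is needed; the account URL, key, container, and blob names are placeholders you would replace with your own.
import pandas as pd
from io import BytesIO
from azure.storage.blob import BlobServiceClient  # azure-storage-blob >= 12

# placeholder values - replace with your own account, key, container, and blob
ACCOUNT_URL = 'https://STORAGE_ACCOUNT_NAME.blob.core.windows.net'
STORAGEACCOUNTKEY = 'STORAGE_KEY'
CONTAINERNAME = 'CONTAINER_NAME'
BLOBNAME = 'BLOB_NAME.csv'

# connect to the storage account and point at the blob
service = BlobServiceClient(account_url=ACCOUNT_URL, credential=STORAGEACCOUNTKEY)
blob_client = service.get_blob_client(container=CONTAINERNAME, blob=BLOBNAME)

# download the blob into memory and parse it with pandas, no local file needed
data = blob_client.download_blob().readall()
df = pd.read_csv(BytesIO(data), header=0)
df.head()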
The full code snippet is posted in Azure Notebooks here.
Enjoy!