
Automated Data Profiling using ydata-profiling on Pandas Dataframe - #azuredatabricks

#pandasprofiling #python #pandas #dataquality #azuredatabricks #azuredatafactory #azuredataengineer #databricks #dataanalysis

In this session we discuss how to perform data profiling with the ydata-profiling library. The demo uses Jupyter, but the same approach can be applied in Databricks to data stored in your Azure Storage location.
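For a rough sense of what "data profiling" means, here is a minimal standard-library sketch that computes a few per-column statistics (row count, missing values, distinct values, most common value) from CSV text. It is only an illustration: the `profile_columns` helper and the sample data are made up, and this is not how ydata-profiling works internally. The real report covers far more than this.

```python
import csv
import io
from collections import Counter


def profile_columns(csv_text):
    """Return a few basic statistics per column for CSV text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    values = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            # Short rows yield None for missing cells; treat them as empty
            values[name].append((row.get(name) or "").strip())

    summary = {}
    for name, cells in values.items():
        present = [cell for cell in cells if cell != ""]
        counts = Counter(present)
        summary[name] = {
            "count": len(cells),
            "missing": len(cells) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return summary


# Made-up sample data, for illustration only
sample = "sport,wager\nfootball,10\ntennis,\nfootball,25\n"
for column, stats in profile_columns(sample).items():
    print(column, stats)
```

A profiling library automates this kind of summary for every column and adds distributions, correlations and alerts on top.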

Link for ydata-profiling page : https://pypi.org/project/ydata-profil...
Link for csv data set : https://www.kaggle.com/datasets/matto...


Sample Code :
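# Run in a terminal, or as %pip install in a Jupyter/Databricks notebook cell
pip install ydata-profiling

import pandas as pd
from ydata_profiling import ProfileReport

# Raw strings (r"...") keep backslashes in Windows paths from being read as escapes
df1 = pd.read_csv(r"D:\Data_Quality\Selected_Online_Sport_Wagering_Data.csv")

# Build the report and write it out as a standalone HTML file
report = ProfileReport(df1, title="Quality_Test", explorative=True)
report.to_file(r"D:\Data_Quality\Data_results.html")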
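The description mentions applying the same idea in Databricks to data in Azure Storage. A minimal sketch of how that might look is below. The container, storage account and file names are hypothetical placeholders, the `abfss_uri` and `profile_html` helpers are illustrative, and it assumes the cluster already has access to the storage account and has ydata-profiling installed. The Databricks-only calls are left as comments because `spark` and `displayHTML` exist only inside a notebook.

```python
def abfss_uri(container, account, path):
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def profile_html(pandas_df, title="Quality_Test"):
    """Return the profiling report for a pandas DataFrame as an HTML string."""
    # Imported lazily so the helper above still works where ydata-profiling is absent
    from ydata_profiling import ProfileReport

    return ProfileReport(pandas_df, title=title, explorative=True).to_html()


# Hypothetical names: replace container, account and file path with your own
source = abfss_uri("mycontainer", "mystorageaccount", "Data_Quality/wagering.csv")
print(source)

# In a Databricks notebook, where `spark` and `displayHTML` are predefined:
# sdf = spark.read.csv(source, header=True, inferSchema=True)
# displayHTML(profile_html(sdf.limit(10000).toPandas()))
```

Limiting the rows before `toPandas()` keeps the conversion from pulling a very large dataset onto the driver; the 10000 figure is arbitrary and should be tuned to your data.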



#dataprofiling #dataengineeringessentials #dataengineering #dataengineer #pyspark #KnowledgeShare #ydata-quality #automateddataprofiling
