🛠️ Building... and rebuilding... and rebuilding

Samarth Garge SamarthGarge


SamarthGarge / data_cleaning_functions.py

Created March 26, 2025 10:58

Outliers feature: there are two ways to use it: 1) via the function in outliers.py, or 2) by integrating it into the clean_dataframe function

	def clean_dataframe(df, null_method='nan', fix_numeric=True, remove_dups=True, dup_subset=None,
	                    detect_outliers_flag=False, outlier_columns=None, outlier_method='zscore',
	                    outlier_threshold=3, outlier_processing='remove', outlier_cap_values=None):
	    """
	    Apply all cleaning functions in sequence

	    Parameters:
	    -----------
	    df : pandas.DataFrame
	        The dataframe to clean
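
The preview above is cut off before the outlier logic, so the following is only a rough sketch of what a z-score step with `outlier_threshold=3` and `outlier_processing='remove'` could look like. The helper name `handle_outliers_zscore`, the `'cap'` behaviour, and the use of the population standard deviation are assumptions for illustration, not the gist's actual code; `outlier_cap_values` is left out because its meaning is not shown.

```python
import pandas as pd


def handle_outliers_zscore(df, columns=None, threshold=3, processing="remove"):
    """Hypothetical helper (not from the gist): z-score outlier handling."""
    result = df.copy()  # never mutate the caller's dataframe
    if columns is None:
        # Assumption: default to every numeric column
        columns = result.select_dtypes(include="number").columns.tolist()

    keep = pd.Series(True, index=result.index)
    for col in columns:
        mean = result[col].mean()
        std = result[col].std(ddof=0)  # population std; an assumed choice
        if pd.isna(std) or std == 0:
            continue  # constant or empty column: nothing to flag

        if processing == "remove":
            z = (result[col] - mean) / std
            # NaN z-scores compare as False, so missing values are kept
            keep &= ~(z.abs() > threshold)
        elif processing == "cap":
            # Clip values to mean +/- threshold * std
            result[col] = result[col].clip(lower=mean - threshold * std,
                                           upper=mean + threshold * std)
        else:
            raise ValueError(f"unknown processing mode: {processing!r}")

    return result[keep]
```

With `processing="remove"`, a row is dropped if any selected column flags it; with `processing="cap"`, all rows are kept and extreme values are pulled back to the threshold boundary.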

SamarthGarge / data_cleaning_functions.py

Created March 23, 2025 10:21

Clean_data

	import pandas as pd
	import numpy as np
	import re
	import os
	from pathlib import Path

	def handle_null_values(df, method='nan', columns=None):
	    """
	    Check for null values and replace them with the specified value.
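
The body of `handle_null_values` is not visible in this preview, so here is a minimal sketch of one way to "check for null values and replace them". Everything in it is an assumption for illustration: the helper name `replace_null_like`, the `fill_value` parameter, and the list of placeholder strings treated as null.

```python
import numpy as np
import pandas as pd

# Assumed placeholder strings; the gist's own list (if any) is not shown.
NULL_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def replace_null_like(df, fill_value=np.nan, columns=None):
    """Hypothetical helper (not from the gist): normalise null-like cells."""
    result = df.copy()  # leave the input dataframe untouched
    if columns is None:
        columns = result.columns

    for col in columns:
        s = result[col]
        # True where the cell is a string that matches a placeholder token
        is_token = s.map(
            lambda v: isinstance(v, str) and v.strip().lower() in NULL_TOKENS
        )
        mask = (s.isna() | is_token).astype(bool)
        # Replace real nulls and placeholder strings with fill_value
        result[col] = s.mask(mask, fill_value)

    return result
```

Calling `replace_null_like(df)` turns both real nulls and placeholder strings into `NaN`; passing `fill_value=0, columns=["b"]` fills only column `b` and leaves the others as they were.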