I cleaned this dataset using SQL statements. The data was first retrieved and explored, then updated as follows:
- Converting String Date to Date datatype.
- Updating missing values in Address Column.
- Formatting ParcelID, Acreage, LandValue.
- Breaking out PropertyAddress and OwnerAddress into individual columns.
- Changing Y and N to Yes and No in the 'SoldAsVacant' field.
- Removing duplicates and redundant columns.
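
Several of the steps above can be sketched as follows. This is a minimal illustration, not the project's actual script: it runs the SQL against an in-memory SQLite database through Python's `sqlite3` module so it works without a server, whereas the project itself targets SQL Server, whose syntax differs. The sample rows, the `to_iso_date` helper, the new column names `PropertyStreet` and `PropertyCity`, the rule of filling a missing address from another row with the same ParcelID, and the choice of columns that define a duplicate are all assumptions made for the example.

```python
import sqlite3
from datetime import datetime


def to_iso_date(text):
    """Convert a string such as 'April 9, 2013' to '2013-04-09' (hypothetical helper)."""
    return datetime.strptime(text, "%B %d, %Y").date().isoformat()


# Made-up sample rows: UniqueID, ParcelID, PropertyAddress, SaleDate, SoldAsVacant
rows = [
    (1, "P1", "100 MAIN ST, NASHVILLE", "April 9, 2013", "N"),
    (2, "P1", None, "June 10, 2014", "Y"),                          # missing address
    (3, "P2", "200 OAK AVE, NASHVILLE", "September 26, 2016", "Yes"),
    (4, "P2", "200 OAK AVE, NASHVILLE", "September 26, 2016", "Yes"),  # duplicate of 3
]

conn = sqlite3.connect(":memory:")
conn.create_function("to_iso_date", 1, to_iso_date)
conn.execute(
    """
    CREATE TABLE housing (
        UniqueID INTEGER PRIMARY KEY,
        ParcelID TEXT,
        PropertyAddress TEXT,
        SaleDate TEXT,
        SoldAsVacant TEXT
    )
    """
)
conn.executemany("INSERT INTO housing VALUES (?, ?, ?, ?, ?)", rows)

statements = [
    # 1. Convert the string date to ISO text (SQLite has no dedicated DATE type).
    "UPDATE housing SET SaleDate = to_iso_date(SaleDate)",
    # 2. Fill a missing address from another row with the same ParcelID (assumed rule).
    """
    UPDATE housing
    SET PropertyAddress = (
        SELECT b.PropertyAddress
        FROM housing AS b
        WHERE b.ParcelID = housing.ParcelID
          AND b.UniqueID <> housing.UniqueID
          AND b.PropertyAddress IS NOT NULL
        LIMIT 1
    )
    WHERE PropertyAddress IS NULL
    """,
    # 3. Break the address out into street and city at the first comma.
    "ALTER TABLE housing ADD COLUMN PropertyStreet TEXT",
    "ALTER TABLE housing ADD COLUMN PropertyCity TEXT",
    """
    UPDATE housing
    SET PropertyStreet = substr(PropertyAddress, 1, instr(PropertyAddress, ',') - 1),
        PropertyCity = trim(substr(PropertyAddress, instr(PropertyAddress, ',') + 1))
    """,
    # 4. Change Y and N to Yes and No, leaving other values untouched.
    """
    UPDATE housing
    SET SoldAsVacant = CASE SoldAsVacant
        WHEN 'Y' THEN 'Yes'
        WHEN 'N' THEN 'No'
        ELSE SoldAsVacant
    END
    """,
    # 5. Remove duplicates, keeping the lowest UniqueID in each group (assumed key).
    """
    DELETE FROM housing
    WHERE UniqueID NOT IN (
        SELECT MIN(UniqueID)
        FROM housing
        GROUP BY ParcelID, PropertyAddress, SaleDate
    )
    """,
]
for statement in statements:
    conn.execute(statement)

cleaned = conn.execute(
    "SELECT UniqueID, PropertyStreet, PropertyCity, SaleDate, SoldAsVacant "
    "FROM housing ORDER BY UniqueID"
).fetchall()
for row in cleaned:
    print(row)
```

The address fill runs before the split and the duplicate removal so that later steps see complete rows. In SQL Server the same ideas would typically use different functions and syntax, so treat the statements here as an outline rather than a drop-in script.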
This is a guided project from AlexTheAnalyst: https://github.com/AlexTheAnalyst/PortfolioProjects/blob/main/Nashville%20Housing%20Data%20for%20Data%20Cleaning.xlsx
- Dataset: Nashville_Housing_Data.xlsx, in .csv format
- Code: Nashville_Data_Cleaning.sql
- Docker
- Azure Data Studio
This dataset has 54403 records. Each record describes a Nashville property, including details of the property and its owner.
Description of the variables:
- UniqueID: Primary key
- ParcelID: varchar
- LandUse: varchar, type of the property, e.g. Condo, Church, Apartment, Daycare
- PropertyAddress: varchar
- SaleDate: Date
- SalePrice: Integer
- LegalReference: varchar
- SoldAsVacant: Bool, Yes/No
- OwnerName: varchar
- OwnerAddress: varchar
- Acreage: Float
- TaxDistrict: varchar
- LandValue: Integer
- BuildingValue: Integer
- TotalValue: Integer
- YearBuilt: Integer
- Bedrooms: Integer
- FullBath: Integer
- HalfBath: Integer
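
As a rough sketch, the variables above could map onto a table definition like the one below. The table name `nashville_housing` is hypothetical, and the example uses SQLite through Python's `sqlite3` module only so it can be run as-is; the project itself uses SQL Server, where the declared types would behave differently.

```python
import sqlite3

# Column names and types follow the variable list; the table name is an assumption.
SCHEMA = """
CREATE TABLE nashville_housing (
    UniqueID        INTEGER PRIMARY KEY,
    ParcelID        VARCHAR,
    LandUse         VARCHAR,   -- e.g. Condo, Church, Apartment, Daycare
    PropertyAddress VARCHAR,
    SaleDate        DATE,
    SalePrice       INTEGER,
    LegalReference  VARCHAR,
    SoldAsVacant    VARCHAR,   -- listed as Bool; holds Yes/No text here
    OwnerName       VARCHAR,
    OwnerAddress    VARCHAR,
    Acreage         FLOAT,
    TaxDistrict     VARCHAR,
    LandValue       INTEGER,
    BuildingValue   INTEGER,
    TotalValue      INTEGER,
    YearBuilt       INTEGER,
    Bedrooms        INTEGER,
    FullBath        INTEGER,
    HalfBath        INTEGER
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(nashville_housing)")]
print(columns)
```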