11 Essential steps of Data Cleaning for Beginners

Maria Asghar
2 min read · Sep 6, 2022

Data cleaning is an essential step before analyzing your data. Even a mild inconsistency in the data can lead to ambiguous or wrong findings.

Figure: Google Sheets snippet.

Some steps are worth considering in almost any cleaning job. Every dataset is different, so the exact steps vary, but these are the most commonly used ones for cleaning data in Excel or Google Sheets:

  1. Duplicates: To get rid of duplicate values, you can use the Remove Duplicates function. In Google Sheets, select the column you want to remove duplicates from, go to ‘Data’, select ‘Data Cleanup’, and choose ‘Remove Duplicates’.
  2. Null values: To check for null values, you can use conditional formatting and filters in Excel or Google Sheets.
  3. Misspelled words: Locate and correct any misspellings.
  4. Mistyped numbers: Double-check that your numeric data has been entered correctly. To correct the number format, select the desired numeric type from the Google Sheets toolbar.
  5. Extra spaces and characters: Check for any extra spaces or characters and use the TRIM function to remove extra spaces.
  6. Mismatched data type: Check whether any column mixes string, numeric, and date types. If there is a mismatch, correct it.
  7. Inconsistent strings: Make sure all of the strings are consistent and meaningful.
  8. Inconsistent date format: Use a single date format across the entire dataset to avoid inconsistencies.
  9. Truncated Data: Check for truncated or missing data that needs correction.
  10. Misleading column names: Name your columns meaningfully.
  11. Business logic: Make sure that the data makes sense given your knowledge of the business. If you think some data is irrelevant, leave it out of your analysis to avoid wrong conclusions.
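If you ever export your sheet to CSV and want to script a few of these checks, here is a minimal Python sketch of steps 1, 2, 5, and 8 (duplicates, null values, extra spaces, and date formats). It is only an illustration, not a replacement for the spreadsheet features above: the sample rows, the column names (`name`, `city`, `joined`), the list of date formats, and the helper names `trim`, `normalize_date`, and `clean_rows` are all assumptions made up for this example. In particular, it assumes slash-separated dates are day/month/year, which you would need to confirm for your own data.

```python
from datetime import datetime

# Hypothetical sample rows, as they might look after exporting a sheet to CSV.
RAW_ROWS = [
    {"name": "  Alice ", "city": "Lahore", "joined": "2022-09-06"},
    {"name": "Alice", "city": "Lahore ", "joined": "06/09/2022"},
    {"name": "Bob", "city": "", "joined": "2022/09/06"},
]

# Date formats assumed to appear in the data; adjust for your own dataset.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")


def trim(value):
    """Remove leading/trailing whitespace and collapse repeated whitespace,
    roughly like the spreadsheet TRIM function."""
    return " ".join(value.split())


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def clean_rows(rows):
    """Trim every cell, mark empty cells as None, unify dates, drop duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            value = trim(value)
            # An empty cell becomes None so null values are easy to spot.
            new_row[key] = value if value else None
        if new_row.get("joined"):
            new_row["joined"] = normalize_date(new_row["joined"])
        # Rows that are identical after cleaning count as duplicates.
        fingerprint = tuple(new_row[k] for k in sorted(new_row))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(new_row)
    return cleaned


if __name__ == "__main__":
    for row in clean_rows(RAW_ROWS):
        print(row)
```

Notice that the two "Alice" rows only turn out to be duplicates after trimming and date normalization, which is why the order of the cleaning steps matters.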

As mentioned earlier, each dataset can have different types of inconsistencies, and they will require different steps to deal with them. Most datasets will need the checklist above to make sure they are clean. Recognizing the common errors in various datasets takes practice. The 11 steps above can help you get started on this journey!

For further information on functions in Google Sheets, check out the Google Workspace Learning Center. For Excel, you can check this resource.

Happy Learning!
