Delete duplicates in Google Sheets: A practical step-by-step guide

Learn practical, proven methods to delete duplicates in Google Sheets. This step-by-step guide covers built-in tools, formulas, and automation for clean, reliable data without losing important context.

How To Sheets
How To Sheets Team
Photo by StockSnap via Pixabay

You will learn how to identify and delete duplicate rows in Google Sheets using a mix of built-in tools, formulas, and automation. By the end, you’ll confidently remove duplicates while preserving needed data, with options for single-column and multi-column criteria.

Why duplicates matter in Google Sheets

Duplicates in Google Sheets can distort analysis, inflate totals, and obscure true trends. They often appear when importing data from other systems, copying and pasting between sheets, or combining datasets with slightly different formats. If duplicates aren’t handled, you may misreport results, create inconsistent records, or waste time cleaning data before every analysis. A consistent deduplication workflow improves data quality and speeds up decision-making for students, professionals, and small business owners who rely on accurate sheets.

Understanding what counts as a duplicate is essential. Depending on the task, duplicates can mean identical rows, identical values within a single column, or identical key combinations across several columns. Defining the uniqueness criteria upfront helps you choose the right method and minimizes accidental data loss. The goal is to keep one canonical record for each unique entry while preserving context for related fields.

According to How To Sheets, establishing a clear deduplication strategy reduces manual work and increases reliability across recurring tasks. This guide presents practical, repeatable steps you can apply to most projects in Google Sheets.

Common scenarios where duplicates show up

Duplicates typically appear in scenarios like importing customer lists from multiple sources, appending new records to an existing sheet, or merging spreadsheets with overlapping entries. In some cases, minor differences—such as trailing spaces, capitalization, or extra punctuation—make rows appear unique even when the core values are the same. When duplicates aren’t handled, you may end up sending communications to the same contact multiple times or double-counting inventory.

To prevent this, plan your deduplication around three questions: Which columns define a unique record? Should duplicates be treated as exact matches (case-sensitive) or flexible matches (case-insensitive)? Do you need to preserve the first occurrence or the most recent one? Having clear answers helps you choose the right method and maintain data integrity across sheets.
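To make the three questions concrete, here is a minimal JavaScript sketch (the language Apps Script uses). The names `normalizeCell` and `dedupe`, and the `exact` and `keep` options, are hypothetical and chosen only for illustration. It assumes the rows are a plain 2D array without the header row, which is the shape Apps Script returns from `getValues()`.

```javascript
// Hypothetical helper: prepares one cell for comparison.
// exact = true compares the raw text; otherwise spaces are trimmed and case is ignored.
function normalizeCell(value, exact) {
  const text = String(value);
  return exact ? text : text.trim().toLowerCase();
}

// rows: 2D array without the header row.
// keyColumns: zero-based indexes of the columns that define a unique record.
// options.keep: "first" keeps the earliest occurrence, "last" keeps the most recent.
function dedupe(rows, keyColumns, options = {}) {
  const { exact = false, keep = "first" } = options;
  const chosen = new Map(); // key -> index of the row to keep
  rows.forEach((row, index) => {
    const key = JSON.stringify(keyColumns.map((c) => normalizeCell(row[c], exact)));
    if (keep === "last" || !chosen.has(key)) {
      chosen.set(key, index);
    }
  });
  const keepIndexes = new Set(chosen.values());
  // Filtering by index preserves the original row order.
  return rows.filter((_, index) => keepIndexes.has(index));
}

const contacts = [
  ["Alice ", "a@x.com", 1],
  ["alice", "a@x.com", 2],
  ["Bob", "b@x.com", 3],
];
console.log(dedupe(contacts, [0, 1]));                   // keeps rows 1 and 3
console.log(dedupe(contacts, [0, 1], { keep: "last" })); // keeps rows 2 and 3
console.log(dedupe(contacts, [0, 1], { exact: true }));  // keeps all three rows
```

Because every cell is converted with `String`, the number 1 and the text "1" count as the same value in this sketch. Adjust that if the distinction matters in your data.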

Quick identification methods (overview)

There are several approaches to identify duplicates in Google Sheets. The built-in Remove duplicates tool is fast for simple cases where you only need to consider a few columns. Formulas offer more control, letting you flag or separate duplicates for review. Conditional formatting provides a visual cue, so you can review duplicates before deciding what to delete. For automation, Google Apps Script can run deduplication on a schedule or trigger.

Choosing a combination of methods often yields the best result: use built-in removal for cleanup, formulas to verify, and conditional formatting for ongoing monitoring. If your data updates regularly, consider setting up scripts to automate the process while keeping a backup copy.

Method 1: Remove duplicates with the built-in tool

The built-in Remove duplicates tool in Google Sheets is a fast way to prune exact duplicates. Select the range you want to deduplicate, including headers if your sheet uses them. Open Data > Data cleanup > Remove duplicates. If your data has a header row, enable that option so headers aren’t treated as data, then check the columns that define duplicates. Click Remove duplicates. Sheets then reports how many duplicate rows it removed and how many unique rows remain; review that summary and use Undo if the numbers look wrong. This method works well for straightforward, single-column or multi-column duplicate checks.

Tip: Always run the tool on a copy of the sheet first. If the result isn’t what you expected, you can restore from the backup copy and try a different method.

Method 2: Find duplicates with formulas (multi-column keys)

Formulas provide fine-grained control for multi-column uniqueness. Create a helper column that builds a unique key by concatenating the values of the columns that define identity, using a delimiter to prevent accidental matches. For example, in column D, enter a formula like =A2 & "|" & B2 & "|" & C2 to create a composite key, and fill it down. Then, in another column, compute a flag like =COUNTIF($D$2:$D$1000, D2) > 1 to indicate duplicates. You can filter or sort by this flag to review and decide what to remove. This approach is especially valuable when duplicates span multiple columns, and because COUNTIF ignores letter case, it also catches keys that differ only in capitalization.

Pro-tip: Normalize text before creating the key (trim spaces, standardize case) so near-identical entries aren’t missed.
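The helper-column approach can also be sketched in JavaScript, for example as a starting point for an Apps Script version. The function names `compositeKey` and `flagDuplicates` are hypothetical, and the sketch assumes the rows are a 2D array without the header row. It applies the normalization from the tip (trim and lowercase), which goes slightly beyond what the raw formulas do.

```javascript
// Builds a composite key, similar to the helper formula =A2 & "|" & B2 & "|" & C2.
// Trimming and lowercasing follow the normalization tip; they are not part of that formula.
function compositeKey(row, keyColumns, delimiter = "|") {
  return keyColumns.map((c) => String(row[c]).trim().toLowerCase()).join(delimiter);
}

// Returns one boolean per row: true when the row's key occurs more than once,
// which is the role of the COUNTIF(...) > 1 flag column.
function flagDuplicates(rows, keyColumns) {
  const keys = rows.map((row) => compositeKey(row, keyColumns));
  const counts = new Map();
  keys.forEach((key) => counts.set(key, (counts.get(key) || 0) + 1));
  return keys.map((key) => counts.get(key) > 1);
}

const rows = [
  ["ab", "c", "x"],
  ["a", "bc", "x"],
  ["AB ", "c", "x"],
];
console.log(flagDuplicates(rows, [0, 1, 2])); // [ true, false, true ]

// Without a delimiter, the first two rows would collapse into the same key.
console.log(compositeKey(rows[0], [0, 1, 2], "") === compositeKey(rows[1], [0, 1, 2], "")); // true
```

The last line shows why the delimiter matters: "ab" + "c" and "a" + "bc" are different records, but they produce the same text when joined without a separator.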

Method 3: Highlight duplicates with conditional formatting

Conditional formatting helps you visually review duplicates before deleting. Select the range and apply a custom formula rule such as =COUNTIF($A$2:$A$1000, $A2) > 1 for single-column checks, or extend to multiple columns with a concatenated key. Choose a bright color so duplicates stand out. This method is best for exploratory cleanup and ensures you can spot patterns across the dataset. Remember to review highlighted rows carefully to avoid removing unique but related records inadvertently.

Tip: Combine with a filter to show only highlighted rows, then delete or adjust as needed.
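If you prefer a list of the rows that a rule like this would highlight, a small JavaScript sketch can group them for review. The name `duplicateGroups` and the `firstRow` parameter are hypothetical. The sketch assumes you pass in one column's values from top to bottom, ignores letter case as COUNTIF does, and skips blank cells, which is a choice made here rather than something the formatting rule guarantees.

```javascript
// values: the cells of one column, top to bottom.
// firstRow: the sheet row number of values[0] (2 when row 1 is the header).
// Returns [value, rowNumbers] pairs for every value that occurs more than once.
function duplicateGroups(values, firstRow = 2) {
  const rowsByValue = new Map();
  values.forEach((value, i) => {
    const key = String(value).toLowerCase();
    if (key === "") return; // skip blank cells
    if (!rowsByValue.has(key)) rowsByValue.set(key, []);
    rowsByValue.get(key).push(firstRow + i);
  });
  return [...rowsByValue].filter(([, rowNumbers]) => rowNumbers.length > 1);
}

const emails = ["a@x.com", "b@x.com", "A@x.com", "", "", "b@x.com", "c@x.com"];
console.log(duplicateGroups(emails));
// [ [ 'a@x.com', [ 2, 4 ] ], [ 'b@x.com', [ 3, 7 ] ] ]
```

Seeing the row numbers grouped by value makes it easier to decide which occurrence to keep before deleting anything.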

Method 4: Automate deduplication with Google Apps Script

For recurring deduplication tasks, Apps Script offers a repeatable, automated solution. Write a small script to identify and remove duplicates based on your chosen criteria, or to flag duplicates for manual review. Automating this step saves time on large datasets and reduces human error. Start with a safe, test script that operates on a copy of your data, then progressively add features like multi-column keys, case-insensitive matching, and scheduled runs.

Pro tip: Use triggers to run the script at regular intervals or when a sheet changes, but ensure safeguards (like creating backups) are in place before enabling automation.
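A safe starting script might look like the sketch below. It is an illustration under stated assumptions, not a finished tool: the function names, the `KEY_COLUMNS` choice, and the single header row are all hypothetical. `uniqueRows` is plain JavaScript that you can test anywhere, while `dedupeActiveSheet` uses the Apps Script `SpreadsheetApp` service and only runs inside Google Sheets.

```javascript
// Pure helper: keeps the first row for each key built from keyColumns (zero-based).
// Keys are trimmed and lowercased, so matching is case-insensitive.
function uniqueRows(rows, keyColumns) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(keyColumns.map((c) => String(row[c]).trim().toLowerCase()));
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Apps Script entry point. Assumes row 1 is a header and that the first
// two columns define a unique record (both are assumptions for this example).
function dedupeActiveSheet() {
  const KEY_COLUMNS = [0, 1];
  const spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  const sheet = spreadsheet.getActiveSheet();

  // Safeguard: copy the sheet inside the same spreadsheet before changing it.
  sheet.copyTo(spreadsheet).setName(sheet.getName() + " backup " + Date.now());

  const values = sheet.getDataRange().getValues();
  const header = values[0];
  const output = [header, ...uniqueRows(values.slice(1), KEY_COLUMNS)];

  // Rewrite the sheet with the header plus the rows that were kept.
  sheet.clearContents();
  sheet.getRange(1, 1, output.length, header.length).setValues(output);
  Logger.log("Removed %s duplicate rows", values.length - output.length);
}
```

Note that `getValues()` and `setValues()` work with cell values, so any formulas in the sheet are replaced by their results when the data is written back. For exact-match cleanup, Apps Script's `Range.removeDuplicates()` may be a simpler option.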

Validation, edge cases, and best practices

After deduplication, validate that critical rows remain intact. Check that key fields align with expectations and that related data didn’t get detached during the process. Edge cases include handling blank rows, preserving the first occurrence in certain scenarios, and dealing with numeric vs. text mismatches. Best practices include: backing up data, testing methods on subsets, documenting the chosen approach, and keeping a clear log of changes. If your data sources update over time, consider implementing a repeatable workflow that can be executed with a single button click or script.
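These checks can be expressed as a short JavaScript sketch that compares the backup with the cleaned data. The name `validateDedup` and the shape of its result are hypothetical, and it assumes both datasets are 2D arrays without header rows that use the same key columns.

```javascript
// before: rows from the backup; after: rows from the deduplicated sheet.
// keyColumns: zero-based indexes of the columns that define a unique record.
function validateDedup(before, after, keyColumns) {
  const keyOf = (row) =>
    JSON.stringify(keyColumns.map((c) => String(row[c]).trim().toLowerCase()));
  const beforeKeys = new Set(before.map(keyOf));
  const afterKeyList = after.map(keyOf);
  const afterKeys = new Set(afterKeyList);
  return {
    // How many rows the cleanup removed.
    removedCount: before.length - after.length,
    // True when every key appears only once in the cleaned data.
    noDuplicatesRemain: afterKeys.size === afterKeyList.length,
    // True when every original key still has a surviving row.
    noKeysLost: [...beforeKeys].every((key) => afterKeys.has(key)),
  };
}

const before = [["Alice", "a@x.com"], ["alice", "a@x.com"], ["Bob", "b@x.com"]];
const after = [["Alice", "a@x.com"], ["Bob", "b@x.com"]];
console.log(validateDedup(before, after, [0, 1]));
// { removedCount: 1, noDuplicatesRemain: true, noKeysLost: true }
```

If `noKeysLost` is false, the cleanup removed a record that had no duplicate, which is the signal to restore from the backup and revisit your key columns.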

Tools & Materials

  • Computer with internet access (Chrome, Edge, or another modern browser recommended)
  • Google account (access to Google Sheets with editing permissions)
  • Sample dataset or test sheet (preferably a copy of real data for practice)
  • Backup copy of the dataset (save a separate file before deduplication)
  • Apps Script editor (optional, if automating deduplication)

Steps

Estimated time: 45-90 minutes

  1. Back up your data

    Create a separate copy of the sheet or workbook to serve as a safety net. This allows you to revert quickly if the deduplication produces unexpected results.

    Tip: Use File > Make a copy to preserve the original dataset.
  2. Choose the dedup scope

    Decide which columns define a unique record. Include headers in your selection if your sheet uses them to keep the results aligned.

    Tip: If unsure, start with a small subset of columns and expand later.
  3. Remove duplicates using the built-in tool

    Open Data > Data cleanup > Remove duplicates and select the key columns. Confirm the header option, apply the tool, and check the summary of removed rows.

    Tip: Compare the result against your backup to confirm what was removed.
  4. Create a multi-column key with formulas

    If duplicates span multiple columns, build a helper key by concatenating the key columns. Use this key to identify duplicates with a COUNTIF check.

    Tip: Normalize text first (trim spaces and standardize case) so near-identical entries aren’t missed.
  5. Highlight duplicates with conditional formatting

    Set up a rule to color duplicate rows or keys, enabling quick visual review before deletion.

    Tip: Filter by color to isolate duplicates easily.
  6. Automate with Google Apps Script (optional)

    Write a script to run dedup logic on a schedule or on sheet edits. Start with a safe test script on a copy, then extend.

    Tip: Add a confirmation log and backup step in the script.
  7. Validate results

    Run a final check to ensure no important data was removed and duplicates are truly eliminated according to your criteria.

    Tip: Compare counts before and after deduplication.
  8. Clean up and finalize

    Remove any helper columns and ensure the sheet is clean, consistent, and ready for use.

    Tip: Document the rules you used for future audits.
Pro Tip: Always back up your data before deduplication.
Warning: Deduplication can remove rows that seem identical but carry context in other columns—define your uniqueness carefully.
Note: Test methods on a small subset or a duplicate of the dataset before applying to the full sheet.
Pro Tip: Use a composite key for multi-column dedup to prevent false positives.
Pro Tip: Automate recurring dedup tasks with Apps Script for consistency and time savings.

FAQ

What is considered a duplicate in Google Sheets?

A duplicate is a row or record that has identical values in the chosen columns that define uniqueness. The definition depends on which columns you treat as keys. If all key fields match, the row is considered a duplicate.

A duplicate happens when the key fields match exactly. It’s defined by which columns you choose as the unique identifiers.

Which method is best for simple deduplication?

For simple cases, using Sheets' built-in Remove duplicates tool is usually fastest. Just select the range, define the key columns, and apply the tool after confirming the header option.

For simple deduplication, the built-in tool is quickest—select the range and run Remove duplicates.

Can deduplication affect data integrity?

Yes, removing duplicates can remove rows that carry unique context if you haven’t defined the keys carefully. Always review the results and keep a backup copy.

Yes, be careful. If you don’t define the keys well, you might lose important information.

How do I deduplicate across multiple columns?

Create a composite key by concatenating the values of multiple columns, then identify duplicates based on that key. This ensures you consider all relevant fields when determining uniqueness.

Use a concatenated key across the relevant columns to detect duplicates across multiple fields.

Is it possible to automate deduplication?

Yes. You can automate deduplication with Google Apps Script, running on a schedule or triggered by changes in the sheet. Always test on a copy first.

You can automate deduplication with Apps Script, but test on a copy first to be safe.

How can I verify that duplicates are truly removed?

After deduplication, compare row counts with the backup and scan key columns to ensure only duplicates were removed. Use filters or pivot tables to confirm data integrity.

Check row counts against the backup and review key columns to confirm accuracy.

The Essentials

  • Back up your data before deduplication
  • Define exact uniqueness criteria across columns
  • Use built-in tools for simple cases and formulas for complex ones
  • Review results with visual aids like conditional formatting
  • Consider automation for frequent deduplication workflows
Process for removing duplicates in Google Sheets