Substring in Google Sheets: A Practical Extraction Guide

Master substring extraction in Google Sheets using TEXTBEFORE, TEXTAFTER, REGEXEXTRACT, and related functions with practical, step-by-step examples and best practices for parsing strings reliably.

How To Sheets
How To Sheets Team
·5 min read
Substring in Sheets - How To Sheets
Quick AnswerDefinition

Substring extraction in Google Sheets is achieved with built-in text functions such as TEXTBEFORE, TEXTAFTER, LEFT, RIGHT, MID, and REGEXEXTRACT. These functions let you pull a portion of a string based on delimiters, positions, or patterns. This guide covers practical examples, edge cases, and best practices for reliable substring operations. According to How To Sheets, mastering these patterns reduces manual cleanup and errors in data parsing.

What substring means in Google Sheets and why it matters\n\nIn data work, a substring is any contiguous portion of a string. Substrings help you extract identifiers, codes, or labels embedded in longer text fields. Google Sheets provides a rich toolbox for substring tasks, from simple left-right extractions to advanced pattern matching with regular expressions. When you know the exact delimiter or pattern, you can automate parsing across entire columns, saving hours of manual cleanup. The How To Sheets team has found that teams often reach for TEXTBEFORE/TEXTAFTER first, then layer REGEXEXTRACT for more complex rules. By understanding the trade-offs, you can choose the most robust approach for your dataset. The aim here is to build reusable patterns that work across rows and columns, not just a single cell.\n\nexcel\n=TEXTBEFORE(A2, "-") // before the first dash\n=TEXTAFTER(A2, "-") // after the first dash\n\n\nBoth functions handle straightforward delimited substrings well, while REGEXEXTRACT covers more flexible patterns. \n

descriptionBLOCKSREADMEED

wordCountCappedNoteForBlock1":null},

Core functions for substring extraction in Sheets\n\nSheets exposes several core functions for substring math. LEFT and RIGHT grab a fixed number of characters, MID returns a substring from a starting position, and FIND/SEARCH locate delimiters. TEXTBEFORE and TEXTAFTER are superb for delimiter-based splits, while REGEXEXTRACT and REGEXREPLACE unlock pattern-based parsing. For reliability, combine these with IFERROR to handle missing substrings and with ARRAYFORMULA for batch processing.\n\nexcel\n=LEFT(B2, 4) // first four characters\n=RIGHT(B2, 3) // last three characters\n=MID(B2, 2, 5) // substring starting at 2 with length 5\n=REGEXEXTRACT(B2, "\\d+") // extract digits from the string\n\n\nexcel\n=TEXTBEFORE(B2, ", ") // before a comma-space delimiter\n=TEXTAFTER(B2, ":") // after a colon delimiter\n

Steps

Estimated time: 60-90 minutes

  1. 1

    Identify the substring target

    Scan the data column to determine the substring you need (before a delimiter, a fixed position, or a pattern). Decide which function best matches the rule. This upfront design prevents rework later.

    Tip: Write down the delimiter or pattern you expect to see.
  2. 2

    Pick the right function

    For fixed-delimiter splits use TEXTBEFORE/TEXTAFTER; for flexible patterns use REGEXEXTRACT. Keep a small test row to confirm behavior before applying to a whole column.

    Tip: Prefer TEXTBEFORE/TEXTAFTER for readability when patterns are stable.
  3. 3

    Build a simple extractor

    Create a single-cell formula to verify the approach. Compare results with manual parsing to ensure correctness.

    Tip: Test with both typical and edge-case examples.
  4. 4

    Handle missing substrings

    Wrap with IFERROR to return a sensible default when a substring isn’t found.

    Tip: Decide what should appear if the substring is absent (blank, original value, or a placeholder).
  5. 5

    Extend to a range

    Use ARRAYFORMULA to apply your extractor to a whole column efficiently.

    Tip: Avoid volatile operations inside array formulas to keep performance reasonable.
  6. 6

    Validate and document

    Cross-check random rows and document the rule so teammates can reuse the pattern.

    Tip: Add comments or a helper row with example outputs.
Pro Tip: When patterns vary, REGEXEXTRACT can be more scalable than multiple TEXTBEFORE/TEXTAFTER calls.
Warning: Regex patterns are case-sensitive by default; add (?i) for case-insensitive matches if needed.
Note: Combine REGEXREPLACE with REGEXEXTRACT to clean data before extracting substrings.

Prerequisites

Keyboard Shortcuts

ActionShortcut
CopyCopy selected cells to clipboardCtrl+C
Paste values onlyPaste without formatting to preserve values onlyCtrl++V
Auto-fill downExtend the formula or pattern down a columnCtrl+D
Find and replaceSearch and replace substrings across a rangeCtrl+H
Edit active cellEdit the current cell contentsF2

FAQ

What is the simplest way to extract text before a delimiter in Google Sheets?

TEXTBEFORE is the simplest approach when you know the delimiter. It returns the portion of text before the first occurrence of the delimiter. For example, =TEXTBEFORE(A2, "-") yields the segment before the dash.

Use TEXTBEFORE to grab text before a delimiter, which is quick and easy for clean data parsing.

How can I extract a dynamic pattern like digits or letters from a string?

REGEXEXTRACT is designed for dynamic patterns. Provide a regex to capture the desired portion, such as digits with \d+ or letters with [A-Za-z]+. If multiple matches exist, REGEXEXTRACT returns the first match by default.

REGEXEXTRACT lets you pull exactly the pattern you describe, which is great for variable text.

How do I handle cases where the substring doesn't exist?

Wrap your extraction in IFERROR or IFNA to return a fallback value, such as a blank or the original string. This prevents breaking downstream formulas when data is incomplete.

If the substring isn’t found, use IFERROR to keep your sheet clean.

Are there performance considerations when applying substrings to large datasets?

Yes. Prefer built-in functions over custom scripts for straightforward parsing. Avoid deeply nested formulas in every cell; use array formulas where appropriate and test on representative samples.

Built-in functions are usually faster and more reliable for large data.

Can TEXTBEFORE/TEXTAFTER handle multiple delimiters?

TEXTBEFORE/TEXTAFTER handle a single delimiter. For multiple delimiters, combine with REGEX or nest additional TEXTBEFORE/TEXTAFTER calls. This gives you flexibility for complex splits.

If you need multiple delimiters, regex is often the clean way to go.

Is there a function to extract from both ends of a string simultaneously?

No single function extracts from both ends. Combine LEFT/RIGHT or TEXTBEFORE/TEXTAFTER with REGEX to capture left and right portions as needed.

You’ll typically combine functions to get both ends of the string.

The Essentials

  • Master substring extraction with TEXTBEFORE/TEXTAFTER
  • Use REGEXEXTRACT for flexible, pattern-based parsing
  • Wrap formulas with IFERROR for robust datasets
  • Leverage ARRAYFORMULA to scale across ranges

Related Articles