Substring in Excel

Mastering Substring Operations in Excel: Extracting Text with Precision

Learn how to effectively extract specific parts of text strings in Excel using powerful functions like LEFT, RIGHT, MID, FIND, and SEARCH. This guide covers basic extractions to complex dynamic substring operations.

Extracting specific portions of text from a larger string, often referred to as 'substring' operations, is a fundamental skill in Excel for data cleaning, analysis, and reporting. Whether you need to pull out a product code, a domain name from a URL, or initials from a full name, Excel provides a robust set of functions to achieve this. This article will guide you through the most common and powerful methods for substring extraction, from simple fixed-length extractions to dynamic extractions based on delimiters.

Basic Substring Functions: LEFT, RIGHT, and MID

Excel offers three primary functions for extracting substrings based on their position and length. These are straightforward to use when you know the exact number of characters you need or their starting position.

LEFT Function

The LEFT function extracts a specified number of characters from the beginning (left side) of a text string.

Syntax: LEFT(text, [num_chars])

  • text: The original text string from which you want to extract characters.
  • num_chars: (Optional) The number of characters you want to extract. If omitted, it defaults to 1.

=LEFT("Apple Banana", 5)

Extracts the first 5 characters from "Apple Banana", returning "Apple".

RIGHT Function

The RIGHT function extracts a specified number of characters from the end (right side) of a text string.

Syntax: RIGHT(text, [num_chars])

  • text: The original text string.
  • num_chars: (Optional) The number of characters to extract from the right. Defaults to 1.

=RIGHT("Apple Banana", 6)

Extracts the last 6 characters from "Apple Banana", returning "Banana".

MID Function

The MID function extracts a specified number of characters from the middle of a text string, starting at a position you specify.

Syntax: MID(text, start_num, num_chars)

  • text: The original text string.
  • start_num: The position of the first character you want to extract. The first character in text is position 1.
  • num_chars: The number of characters you want to extract.

=MID("Apple Banana", 7, 6)

Extracts 6 characters from "Apple Banana", starting at the 7th character, returning "Banana".
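As a quick illustration of the three functions working on a cell rather than a literal string, the sketch below assumes a hypothetical product code "AB-12345-XL" in cell A2 with fixed-width parts. Both the cell address and the code layout are assumptions made for this example.

```excel
=LEFT(A2, 2)
=MID(A2, 4, 5)
=RIGHT(A2, 2)
=VALUE(MID(A2, 4, 5))
```

Under that assumption, the first three formulas return "AB", "12345", and "XL". LEFT, RIGHT, and MID always return text, even when the extracted characters are digits, so the last formula wraps MID in VALUE to convert "12345" to the number 12345.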

While LEFT, RIGHT, and MID are great for fixed-length extractions, real-world data often requires more dynamic solutions. This is where FIND and SEARCH come into play. These functions help locate the position of a specific character or substring within a larger text, allowing you to make your substring formulas adaptable.

Both FIND and SEARCH return the starting position of one text string within another. The key differences are case sensitivity and wildcard support:

  • FIND(find_text, within_text, [start_num]): Case-sensitive, does not support wildcards.
  • SEARCH(find_text, within_text, [start_num]): Case-insensitive, supports wildcards (* for any sequence of characters, ? for any single character).

For delimiters such as a space, comma, or hyphen, case is irrelevant, so FIND and SEARCH return the same position and either works. Prefer SEARCH when the text you are locating is a letter or word whose case may vary, and FIND when the match must be exact.

=FIND("a", "Apple Banana")
=SEARCH("a", "Apple Banana")

FIND returns 8 (the first lowercase 'a', in "Banana"), SEARCH returns 1 (the first 'A' or 'a').
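Two practical details follow from these differences. Both functions return a #VALUE! error when nothing is found, and SEARCH treats ? and * as wildcards unless they are escaped with a tilde (~). A minimal sketch using literal sample strings, which are illustrative only:

```excel
=ISNUMBER(SEARCH("banana", "Apple Banana"))
=ISNUMBER(FIND("banana", "Apple Banana"))
=SEARCH("~?", "Why? Because")
=FIND("a", "Apple Banana", 9)
```

The first formula returns TRUE and the second FALSE, because only the case-sensitive FIND fails to match "banana" against "Banana". The third returns 4, the position of the literal question mark. Without the tilde, "?" would match any single character and SEARCH would return 1. The fourth returns 10: the search starts at position 9, but the result is still counted from the start of the string.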

Combining Functions for Advanced Extractions

The true power of Excel substring operations comes from combining LEFT, RIGHT, and MID with FIND or SEARCH (and sometimes LEN for total length) to create dynamic formulas that adapt to varying text lengths and positions of delimiters.

Example 1: Extracting First Name from 'Firstname Lastname'

Let's say you have a full name in cell A1 (e.g., "John Doe") and you want to extract "John". You need to find the position of the space and then use LEFT.

=LEFT(A1, SEARCH(" ", A1) - 1)

Extracts text before the first space. SEARCH(" ", A1) finds the space, and -1 excludes the space itself.
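The formula above returns a #VALUE! error when A1 contains no space (for example, a single-word name), because SEARCH finds nothing. One possible guard is sketched below, assuming you want the whole cell back in that case and that stray leading or trailing spaces should be ignored:

```excel
=IFERROR(LEFT(TRIM(A1), SEARCH(" ", TRIM(A1)) - 1), TRIM(A1))
```

With "John Doe" in A1 this returns "John". With "Madonna" it returns "Madonna" instead of an error.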

Example 2: Extracting Last Name from 'Firstname Lastname'

To get "Doe" from "John Doe" in cell A1, you need to find the space, then calculate how many characters are to the right of it. LEN helps determine the total length of the string.

=RIGHT(A1, LEN(A1) - SEARCH(" ", A1))

Extracts text after the first space. LEN(A1) is the total length and SEARCH(" ", A1) is the position of the space; subtracting one from the other gives the number of characters after the space.
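Because the formula above keys on the first space, "John Ronald Doe" would give "Ronald Doe". If only the last word is wanted, one common workaround is to mark the last space with a placeholder character and search for that instead. The sketch below assumes the character "|" never appears in A1 (choose another placeholder if it might) and that A1 contains at least one space:

```excel
=RIGHT(A1, LEN(A1) - SEARCH("|", SUBSTITUTE(A1, " ", "|", LEN(A1) - LEN(SUBSTITUTE(A1, " ", "")))))
```

LEN(A1) - LEN(SUBSTITUTE(A1, " ", "")) counts the spaces in A1. Passing that count as the fourth argument of SUBSTITUTE replaces only the last space with "|". For "John Ronald Doe" the placeholder lands at position 12, so RIGHT takes 15 - 12 = 3 characters and returns "Doe".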

Example 3: Extracting Text Between Delimiters

Suppose you have a string like "Product-XYZ-Region" in A1 and you want to extract "XYZ". This requires MID combined with two SEARCH functions to find both the start and end delimiters.

=MID(A1, SEARCH("-", A1) + 1, SEARCH("-", A1, SEARCH("-", A1) + 1) - SEARCH("-", A1) - 1)

Extracts text between the first and second hyphens. The formula finds the first hyphen, then the second hyphen starting after the first, and calculates the length between them.

flowchart TD
    A["Original String (e.g., A1)"] --> B{"Find 1st Delimiter: SEARCH(#quot;-#quot;, A1)<br/>(Pos1)"};
    B --> C{"Find 2nd Delimiter: SEARCH(#quot;-#quot;, A1, Pos1 + 1)<br/>(Pos2)"};
    C --> D{"Calculate Start Position for MID (Pos1 + 1)"};
    C --> E{"Calculate Length for MID (Pos2 - Pos1 - 1)"};
    D & E --> F["MID(A1, StartPos, Length)"];
    F --> G["Extracted Substring"];

Workflow for extracting text between two delimiters using Excel functions.
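In Excel versions that provide the LET function (Microsoft 365 and Excel 2021 or later), the two positions in this workflow can be named so that each SEARCH is written only once. The names firstDash and secondDash are arbitrary labels chosen for this sketch:

```excel
=LET(firstDash, SEARCH("-", A1), secondDash, SEARCH("-", A1, firstDash + 1), MID(A1, firstDash + 1, secondDash - firstDash - 1))
```

For "Product-XYZ-Region", firstDash is 8 and secondDash is 12, so the formula evaluates MID(A1, 9, 3) and returns "XYZ". LET does not accept names that look like cell references, such as pos1, which is why longer names are used here.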

Using TEXTBEFORE and TEXTAFTER (Excel 365)

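For users with Excel 365, the newer functions TEXTBEFORE and TEXTAFTER simplify many common substring extraction tasks, making formulas much cleaner and easier to read. In versions that lack these functions, the formulas return a #NAME? error, so fall back on the LEFT, RIGHT, and MID patterns described earlier.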

TEXTBEFORE Function

Extracts text that occurs before a given delimiter.

Syntax: TEXTBEFORE(text, delimiter, [instance_num], [match_mode], [match_end], [if_not_found])

  • text: The text to search within.
  • delimiter: The text marking the point before which you want to extract.
  • instance_num: (Optional) Which instance of the delimiter to use (e.g., 1st, 2nd). Defaults to 1.
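  • match_mode: (Optional) 0 for a case-sensitive match (the default), 1 for a case-insensitive match.
  • match_end: (Optional) 1 treats the end of the text as a delimiter. Defaults to 0.
  • if_not_found: (Optional) The value to return when the delimiter is not found. Defaults to #N/A.

=TEXTBEFORE("John Doe", " ")

Extracts "John" from "John Doe".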

TEXTAFTER Function

Extracts text that occurs after a given delimiter.

Syntax: TEXTAFTER(text, delimiter, [instance_num], [match_mode], [match_end], [if_not_found])

  • text: The text to search within.
  • delimiter: The text marking the point after which you want to extract.
  • instance_num: (Optional) Which instance of the delimiter to use. Defaults to 1.

=TEXTAFTER("John Doe", " ")

Extracts "Doe" from "John Doe".

These new functions significantly reduce the complexity of formulas, especially when dealing with multiple delimiters or specific instances of a delimiter.
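A few variations are sketched below with literal strings for illustration. The "Madonna" fallback in the last formula is an assumption about what you would want returned when no delimiter exists:

```excel
=TEXTAFTER("Product-XYZ-Region", "-", 2)
=TEXTAFTER("Product-XYZ-Region", "-", -1)
=TEXTBEFORE(TEXTAFTER("Product-XYZ-Region", "-"), "-")
=TEXTBEFORE("Madonna", " ", , , , "Madonna")
```

The first two formulas both return "Region": an instance_num of 2 takes the text after the second hyphen, and -1 counts delimiters from the end of the string. The third nests the two functions to return "XYZ", the text between the first and second hyphens. The fourth supplies if_not_found, so it returns "Madonna" rather than #N/A when there is no space to split on.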