Showing content from https://www.geeksforgeeks.org/dsa/remove-duplicate-words-from-sentence-using-regular-expression/ below:

Remove duplicate words from Sentence using Regular Expression

Last Updated : 12 Jul, 2025

Given a string str that represents a sentence, the task is to remove consecutive duplicate words from the sentence using a regular expression, in programming languages such as C++, Java, C#, Python, and JavaScript.

Examples of Removing Duplicate Words from Sentences

Input: str = "Good bye bye world world" 
Output: Good bye world 
Explanation: We remove the second occurrence of bye and world from Good bye bye world world.

Input: str = "Ram went went to to to his home" 
Output: Ram went to his home 
Explanation: We remove the second occurrence of went and the second and third occurrences of to from Ram went went to to to his home.

Input: str = "Hello hello world world" 
Output: Hello world 
Explanation: We remove the second occurrence of hello and world from Hello hello world world. The match ignores case, so hello counts as a duplicate of Hello.

Approach

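1. Get the sentence.
2. Form a regular expression that matches a word followed by one or more repetitions of itself.

regex = "\\b(\\w+)(?:\\W+\\1\\b)+";

The details of the above regular expression can be understood as:

\\b: a word boundary, so that only whole words are matched.
(\\w+): one or more word characters, captured as group 1.
\\W+: one or more non-word characters, such as spaces or punctuation.
\\1: a backreference to whatever group 1 matched.
(?:...)+: a non-capturing group that is repeated one or more times.

3. Match the sentence against the regex, ignoring case, and replace each match with group 1. In Java, a matcher is obtained using Pattern.matcher().
4. Return the modified sentence.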
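
The pattern from step 2 can also be written out piece by piece. The following is a minimal Python sketch, assuming the re.VERBOSE flag to allow inline comments; the names DUPLICATE_WORDS, runs and result are illustrative and not part of the original article.

```python
import re

# The same pattern, written with re.VERBOSE so each part can be annotated.
DUPLICATE_WORDS = re.compile(
    r"""
    \b          # word boundary: start at the beginning of a word
    (\w+)       # group 1: the word itself
    (?:         # non-capturing group for each repeat
        \W+     #   one or more non-word characters between the words
        \1      #   the same text that group 1 captured
        \b      #   word boundary: the repeat must end at the end of a word
    )+          # one or more repeats
    """,
    re.VERBOSE | re.IGNORECASE,
)

sentence = "Ram went went to to to his home"

# Each match covers a whole run of repeats; group 1 is the word to keep.
runs = [(m.group(0), m.group(1)) for m in DUPLICATE_WORDS.finditer(sentence)]
print(runs)

# Replacing every run with group 1 gives the de-duplicated sentence.
result = DUPLICATE_WORDS.sub(r"\1", sentence)
print(result)
```

Here each run of repeats is a single match, which is why one substitution per match is enough even when a word appears three times in a row.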

Below is the implementation of the above approach:

C++
// C++ program to remove duplicate words
// using Regular Expression or ReGex.
#include <iostream>
#include <regex>
using namespace std;

// Function to remove consecutive duplicate words
// from the sentence
string removeDuplicateWords(string s)
{

  // Regex matching a word followed by one or more
  // repetitions of itself, ignoring case.
  const regex pattern("\\b(\\w+)(?:\\W+\\1\\b)+", regex_constants::icase);

  // Replace every run of repeated words with its first
  // word (capture group 1). Replacing by match position
  // avoids searching the string again for the matched
  // text, which could hit an earlier, unrelated occurrence.
  return regex_replace(s, pattern, "$1");
}

// Driver Code
int main()
{
  // Test Case: 1
  string str1
      = "Good bye bye world world";
  cout << removeDuplicateWords(str1) << endl;

  // Test Case: 2
  string str2
      = "Ram went went to to his home";
  cout << removeDuplicateWords(str2) << endl;

  // Test Case: 3
  string str3
      = "Hello hello world world";
  cout << removeDuplicateWords(str3) << endl;

  return 0;
}

// This code is contributed by yuvraj_chandra
Java
// Java program to remove duplicate words
// Using Regular Expression or ReGex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Driver Class
class GFG {
    // Function to remove consecutive duplicate words
    // from the sentence
    public static String removeDuplicateWords(String input)
    {
        // Regex matching a word followed by one or more
        // repetitions of itself.
        String regex = "\\b(\\w+)(?:\\W+\\1\\b)+";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        // Pattern class contains matcher() method
        // to match the given sentence against
        // the regular expression.
        Matcher m = p.matcher(input);

        // Replace every run of repeated words with
        // capture group 1, the first word of the run.
        // Matcher.replaceAll() works on match positions,
        // so punctuation between the words is never
        // reinterpreted as regex syntax.
        return m.replaceAll("$1");
    }

    // Driver code
    public static void main(String args[])
    {
        // Test Case: 1
        String str1 = "Good bye bye world world";
        System.out.println(removeDuplicateWords(str1));

        // Test Case: 2
        String str2 = "Ram went went to to his home";
        System.out.println(removeDuplicateWords(str2));

        // Test Case: 3
        String str3 = "Hello hello world world";
        System.out.println( removeDuplicateWords(str3));
    }
}
Python3
# Python program to remove duplicate words
# using Regular Expression or ReGex.
import re


# Function to remove consecutive duplicate words
# from the sentence
def removeDuplicateWords(sentence):

    # Regex matching a word followed by one or more
    # repetitions of itself
    regex = r'\b(\w+)(?:\W+\1\b)+'

    # Replace each run of repeated words with group 1
    return re.sub(regex, r'\1', sentence, flags=re.IGNORECASE)


# Driver Code

# Test Case: 1
str1 = "Good bye bye world world"
print(removeDuplicateWords(str1))

# Test Case: 2
str2 = "Ram went went to to his home"
print(removeDuplicateWords(str2))

# Test Case: 3
str3 = "Hello hello world world"
print(removeDuplicateWords(str3))

# This code is contributed by yuvraj_chandra
C#
using System;
using System.Text.RegularExpressions;

class Program
{
    // Function to remove consecutive duplicate words
    // from the sentence
    static string RemoveDuplicateWords(string s)
    {
        // Regex matching a word followed by one or more
        // repetitions of itself, ignoring case.
        Regex pattern = new Regex(@"\b(\w+)(?:\W+\1\b)+", RegexOptions.IgnoreCase);

        // Replace every run of repeated words with
        // capture group 1, the first word of the run.
        // Regex.Replace works on match positions, so the
        // same text elsewhere in the sentence is not touched.
        return pattern.Replace(s, "$1");
    }

    // Driver Code
    static void Main()
    {
        // Test Case: 1
        string str1 = "Good bye bye world world";
        Console.WriteLine(RemoveDuplicateWords(str1));

        // Test Case: 2
        string str2 = "Ram went went to to his home";
        Console.WriteLine(RemoveDuplicateWords(str2));

        // Test Case: 3
        string str3 = "Hello hello world world";
        Console.WriteLine(RemoveDuplicateWords(str3));
    }
}
JavaScript
// Function to remove duplicate words using Regular Expression
function removeDuplicateWords(input) {
    // Regular expression to match repeated words
    let regex = /\b(\w+)(?:\W+\1\b)+/gi;

    // Replace duplicate words with the first occurrence
    return input.replace(regex, '$1');
}

// Test cases
// Test Case: 1
let str1 = "Good bye bye world world";
console.log(removeDuplicateWords(str1));

// Test Case: 2
let str2 = "Ram went went to to his home";
console.log(removeDuplicateWords(str2));

// Test Case: 3
let str3 = "Hello hello world world";
console.log(removeDuplicateWords(str3));

Output
Good bye world
Ram went to his home
Hello world
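Complexity of the above Programs

Time Complexity: O(n) for typical sentences, where n is the length of the string. Backtracking regex engines may take longer on adversarial input.
Auxiliary Space: O(1), not counting the O(n) modified copy of the sentence that is returned.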
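
The regex only collapses repeats that are next to each other. The short Python sketch below checks a few inputs beyond the three test cases; the function name remove_duplicate_words and the sample sentences are illustrative assumptions, not part of the original article.

```python
import re


def remove_duplicate_words(sentence):
    # Same regex as in the article; only back-to-back repeats are matched.
    return re.sub(r"\b(\w+)(?:\W+\1\b)+", r"\1", sentence, flags=re.IGNORECASE)


# \b keeps the "is" inside "This" from pairing with the following "is".
print(remove_duplicate_words("This is a test"))

# Punctuation between repeats is matched by \W+, so it goes away with the repeat.
print(remove_duplicate_words("bye, bye world"))

# Repeats that are not adjacent are left alone.
print(remove_duplicate_words("bye world bye"))
```

Removing duplicates that are separated by other words would need a different technique, such as tracking the words already seen.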


