
STL Split String
By Paul J. Weiss

A function that will split an input string based on a string delimiter 
 Beginner
 VC 4-6, Win95-98, NT4, W2K, ATL, STL
 Posted 15 May 2001
 Updated 23 May 2001

Description:

Below is a function I created and have found extremely useful for splitting strings based on a particular delimiter. The implementation requires only the STL, which makes it easy to port to any platform with a standard C++ library. The function is fairly lightweight, although I haven't done extensive performance testing.

The delimiter can be any number of characters, passed as a string. The parts of the input between delimiters are appended to a vector of strings; empty parts are skipped. The class StringUtils contains a single static function, SplitString. The int it returns is the number of delimiters found in the input string. If the delimiter does not occur at all, the function returns 0 and leaves the vector untouched.

I used this utility mainly for parsing strings passed across platform boundaries. Whether you are using raw sockets or middleware such as TIBCO®, passing string data is straightforward, and I found it more efficient to pass one delimited string than to make repeated calls or send repeated messages. I also used it to pass BSTRs back and forth between a Visual Basic client and an ATL COM DLL, which proved easier than passing a SAFEARRAY as an [in] or [out] parameter. It was also handy when I did not want the added overhead of MFC and hence could not use CString.
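For the sending side of that exchange, something has to build the delimited string in the first place. The sketch below is one possible way to do it; JoinFields is a hypothetical helper and is not part of StringUtils. It assumes the delimiter never occurs inside a field.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of StringUtils): joins fields into one
// delimited message so that a single call or message can carry them all.
std::string JoinFields(const std::vector<std::string>& fields,
                       const std::string& delimiter)
{
    std::string message;
    for (std::vector<std::string>::size_type i = 0; i < fields.size(); ++i)
    {
        // Assumption: the delimiter must not occur inside a field, or the
        // receiver would split that field in two.
        assert(delimiter.empty() ||
               fields[i].find(delimiter) == std::string::npos);

        if (i > 0) { message += delimiter; }   // delimiter goes between fields only
        message += fields[i];
    }
    return message;
}
```

Because SplitString skips empty parts, an empty field will not survive a round trip through JoinFields and SplitString. If empty fields matter, send a placeholder value instead.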

Implementation:

The SplitString function uses the STL string functions find and substr to iterate through the input string. The hardest part was figuring out how to get the substring of the input string based on the offsets of the delimiter, not forgetting to take into account the length of the delimiter. Another hurdle was making sure not to call substr with an offset greater than the length of the input string.
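The offset arithmetic described above can be looked at in isolation. The following is a minimal sketch, not the article's implementation: PieceAfter is a hypothetical helper that takes the offset of one delimiter occurrence and returns the part that follows it, showing both the delimiter-length adjustment and the guard before substr.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration: given the offset of one delimiter
// occurrence, return the piece that follows it (up to the next delimiter
// or the end of the input).
std::string PieceAfter(const std::string& input,
                       const std::string& delimiter,
                       std::string::size_type delimPos)
{
    assert(!delimiter.empty());

    // The piece starts just past the whole delimiter, not one character
    // past delimPos.
    std::string::size_type start = delimPos + delimiter.size();

    // substr throws std::out_of_range if start > input.size(), so guard first.
    if (start >= input.size()) { return std::string(); }

    std::string::size_type next = input.find(delimiter, start);
    if (next == std::string::npos) { return input.substr(start); }

    // The length is the distance from the piece start to the next delimiter.
    return input.substr(start, next - start);
}
```

For the input "a::bc::d" with delimiter "::", the delimiters sit at offsets 1 and 5, so PieceAfter(input, "::", 1) yields "bc" and PieceAfter(input, "::", 5) yields "d".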

Header:

//-----------------------------------------------------------------------
//
//	File:		StringUtils.h
//
//	Purpose:	STL split string utility
//	Author:		Paul J. Weiss
//
//------------------------------------------------------------------------

#ifndef __STRINGUTILS_H_
#define __STRINGUTILS_H_

#include <string>
#include <vector>

using namespace std;

class StringUtils
{

public:

	static int SplitString(const string& input, const string& delimiter, vector<string>& results);

};

#endif

Source:

//-----------------------------------------------------------------------
//
//	File:		StringUtils.cpp
//
//	Purpose:	STL split string utility
//	Author:		Paul J. Weiss
//
//------------------------------------------------------------------------

#include "stdafx.h" // comment out if not using precompiled headers in VC++
#include "StringUtils.h"

int StringUtils::SplitString(const string& input, const string& delimiter, vector<string>& results)
{
	int sizeS2 = (int)delimiter.size();
	int isize = (int)input.size();

	// an empty delimiter would match at every offset, so there is nothing to split on
	if( isize == 0 || sizeS2 == 0 ) { return 0; }

	vector<int> positions;	// offset of every delimiter found

	string::size_type newPos = input.find (delimiter, 0);

	if( newPos == string::npos ) { return 0; }

	int numFound = 0;

	while( newPos != string::npos )
	{
		numFound++;
		positions.push_back((int)newPos);
		// resume the search immediately after this delimiter
		newPos = input.find (delimiter, newPos+sizeS2);
	}

	for( int i=0; i <= (int)positions.size(); i++ )
	{
		string s;
		if( i == 0 )
		{
			// text before the first delimiter
			s = input.substr( 0, positions[0] );
		}
		else
		{
			int offset = positions[i-1] + sizeS2;
			if( offset < isize )
			{
				if( i == (int)positions.size() )
				{
					// text after the last delimiter
					s = input.substr(offset);
				}
				else
				{
					s = input.substr( offset, positions[i] - offset );
				}
			}
		}
		// empty pieces are skipped
		if( s.size() > 0 )
		{
			results.push_back(s);
		}
	}
	return numFound;
}

Usage:

//------------------------
// main.cpp
//
// compiler = VC++6.0
//------------------------

#include "stdafx.h" // you might have to add '#pragma warning(disable: 4786)' to this file
#include <stdio.h>
#include <string>
#include "StringUtils.h"

int main(int argc, char* argv[])
{
	// see Alice's Adventures in Wonderland by LC chapter VII for full context of quote
	string in("You might just as well say that I see what I eat is the same thing as I eat what I see");
	string delim("eat");

	vector<string> results;

	int num = StringUtils::SplitString(in, delim, results);

	printf("input = %s\ndelimiter = %s\n", in.c_str(), delim.c_str());
	printf("Number of %s found = %d\n", delim.c_str(), num);

	for( int i=0; i < results.size(); i++ )
	{
		printf("substring %d = '%s'\n", i+1, results[i].c_str());
	}
	return 0;
}

Output:

input = You might just as well say that I see what I eat is the same thing as I eat what I see
delimiter = eat
Number of eat found = 2
substring 1 = 'You might just as well say that I see what I '
substring 2 = ' is the same thing as I '
substring 3 = ' what I see'

Comments:

Hope you find this as useful as I did. Feel free to let me know of any bugs or enhancements. Enjoy ;)

About Paul J. Weiss

Paul J. Weiss graduated from Fairfield University with a Bachelor of Science degree in Neuroscience. He is currently a developer at a large investment bank in Manhattan. His favorite languages are C++, Java, and Perl, in that order.


Article content copyright Paul J. Weiss, 2001