ULAPI  8.0
ULString Class Reference

A string wrapper class for use in the Ultralingua API.

#include <ulstring.h>

Public Types

enum  { UTF8, UTF16BE, UTF16LE }
 

Public Member Functions

 ULString ()
 
 ULString (const ULString &s)
 
 ULString (const char *s)
 
 ULString (ulchar ch)
 
virtual ~ULString ()
 
void clear ()
 
ULString & operator= (const ULString &s)
 
ULString & operator= (const char *s)
 
ULString & operator+= (const ULString &s)
 
ULString & operator+= (const char *s)
 
ULString & operator+= (ulchar ch)
 
ulchar operator[] (int n) const
 
 operator const char * ()
 
uluint32 length () const
 
void setAt (uluint32 index, ulchar value)
 
int getEncoding () const
 
void setEncoding (const char *encodingName)
 
void setEncoding (int newEncoding)
 
const char * getBuffer ()
 
void fromUTF8 (const char *utf8Buffer)
 
ULStringIterator begin () const
 
ULStringReverseIterator rbegin () const
 
ULStringIterator end () const
 
ULStringReverseIterator rend () const
 
bool contains (ulchar ch) const
 
bool contains (const char *s) const
 
bool contains (const ULString &s) const
 
ULStringIterator find (ulchar ch) const
 
ULStringIterator find (const char *s) const
 
ULStringIterator find (const ULString &s) const
 
ULStringIterator rfind (ulchar ch) const
 
ULStringIterator rfind (const char *s) const
 
ULStringIterator rfind (const ULString &s) const
 
bool startsWith (const ULString &head) const
 
bool startsWith (const ULString &head, ULCollator *collator, bool ignoreCase, bool ignoreAccents) const
 
bool endsWith (const ULString &tail) const
 
bool endsWith (const ULString &tail, ULLanguage language, bool ignoreCase, bool ignoreAccents) const
 
void replace (const ULString &textToReplace, const ULString &replacementText)
 
void replace (ulchar charToReplace, const ULString &replacementText)
 
void replace (const ULStringIterator &iterator, uluint32 nCharsToReplace, const ULString &replacementText)
 
void replace (uluint32 index, uluint32 nCharsToReplace, const ULString &replacementText)
 
void erase (ULStringIterator &iterator, uluint32 nCharsToErase=1)
 
void erase (uluint32 start, uluint32 nCharsToErase=1)
 
void insert (const ULStringIterator &iterator, const ULString &s)
 
void insert (ulint32 insertionIndex, const ULString &s)
 
ULString substr (ulint32 start, ulint32 nChars) const
 
ULString substr (const ULStringIterator &start, const ULStringIterator &end) const
 
ulint32 compare (const ULString &s, const ULLanguage &language) const
 
ulint32 compare (const ULString &s, const ULLanguage &language, bool ignoreCase, bool ignoreAccents) const
 
bool equals (const ULString &s, bool ignoreCase, bool ignoreAccents) const
 
uluint32 hash (uluint32 tableSize) const
 
ULString & reverse ()
 
ULString & toLower ()
 
ULString & toLower (const ULLanguage &language)
 
ULString & toUpper ()
 
ULString & toUpper (const ULLanguage &language)
 
ULString & toBase ()
 
ULString & toBase (const ULLanguage &language)
 
int getIntegerValue () const
 
void appendInteger (int n, uluint32 base=10)
 
void split (const ULString &s, ULList< ULString > &stringList, bool stripParts=false) const
 
void strip (const char *charactersToRemove=0)
 
ULString join (const ULList< ULString > &stringList) const
 

Friends

class ULStringIterator
 
class ULStringReverseIterator
 
class ULCollator
 
bool operator== (const ULString &a, const ULString &b)
 
bool operator< (const ULString &a, const ULString &b)
 

Detailed Description

A string wrapper class for use in the Ultralingua API.

This wrapper will enable us to swap string libraries in and out of ULAPI depending on the constraints of a particular application or platform.

If UL_USING_ICU is #defined, then an implementation based on ICU (http://site.icu-project.org/) will be used. Otherwise, a "legacy" implementation based on ULAPI version 7 will be used.

Member Enumeration Documentation

anonymous enum
Enumerator:
UTF8 
UTF16BE 
UTF16LE 

Constructor & Destructor Documentation

ULString::ULString ( )

Default constructor.

ULString::ULString ( const ULString &  s)

Copy constructor.

ULString::ULString ( const char *  s)
Parameters
[in]sThe string to be stored in this ULString object.

Strings stored as char *'s need to be in an escaped form using only "invariant" characters (essentially printable ASCII). To represent the French word for "to be" as a char * literal, for example, you would use "\\u00EAtre", where 00EA is the hexadecimal codepoint for a lower-case e with a circumflex. The double backslash is necessary because the escape sequence will be interpreted at run-time, not at compile-time.

Non-invariant characters can be represented using either the \uhhhh or the \Uhhhhhhhh escape sequence (again, with the backslash doubled).

For more details on ICU string literals, see http://userguide.icu-project.org/strings, especially the subsection entitled "Compiler-dependent definitions".

Note that we are using the ICU string literals even in non-ICU implementations of ULString, to preserve the ULString interfaces.
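
The run-time interpretation of a doubled-backslash escape can be illustrated with a small standalone sketch. It is not the ULAPI or ICU implementation: unescapeToUtf8 and hexValue are hypothetical helpers, only four-digit \u escapes are handled, surrogate pairs are ignored, and the result is assumed to be wanted as UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: value of one hexadecimal digit, or -1 if c is not one.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper (not part of ULAPI): expands \uXXXX escapes found in an
// ASCII string at run time and returns the result encoded as UTF-8.
std::string unescapeToUtf8(const std::string &escaped) {
    std::string out;
    std::size_t i = 0;
    while (i < escaped.size()) {
        // An escape needs a backslash, a 'u', and four hex digits.
        bool isEscape = escaped[i] == '\\' && i + 5 < escaped.size() &&
                        escaped[i + 1] == 'u';
        std::uint32_t cp = 0;
        for (std::size_t k = 2; isEscape && k <= 5; ++k) {
            int h = hexValue(escaped[i + k]);
            if (h < 0) isEscape = false;
            else cp = cp * 16 + static_cast<std::uint32_t>(h);
        }
        if (!isEscape) {          // ordinary character: copy it through
            out += escaped[i++];
            continue;
        }
        // Encode the code point (at most U+FFFF here) as UTF-8.
        if (cp < 0x80) {
            out += char(cp);
        } else if (cp < 0x800) {
            out += char(0xC0 | (cp >> 6));
            out += char(0x80 | (cp & 0x3F));
        } else {
            out += char(0xE0 | (cp >> 12));
            out += char(0x80 | ((cp >> 6) & 0x3F));
            out += char(0x80 | (cp & 0x3F));
        }
        i += 6;                   // skip the whole \uXXXX sequence
    }
    return out;
}
```

In source code the argument would be written as "\\u00EAtre", so that the compiler passes the backslash through unchanged and the helper sees the six characters \u00EA at run time.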

ULString::ULString ( ulchar  ch)

Constructor.

Parameters
[in]chThe character with which to initialize this string.
ULString::~ULString ( )
virtual

Destructor.

Member Function Documentation

void ULString::appendInteger ( int  n,
uluint32  base = 10 
)

Appends to this string characters representing the specified integer in the specified base.

Parameters
[in]nThe integer to append to this string.
[in]baseThe base in which to represent the integer.
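
As a rough illustration of what appending an integer in a given base involves, here is a standalone sketch on std::string. It is not the ULString implementation; the supported range of bases (2 to 36), the lower-case digits above 9, and the leading minus sign for negative values are all assumptions, since the reference does not specify them.

```cpp
#include <algorithm>
#include <string>

// Hypothetical stand-in for ULString::appendInteger, operating on std::string.
// Assumes 2 <= base <= 36 and uses lower-case letters for digits above 9.
void appendInteger(std::string &s, int n, unsigned base = 10) {
    if (base < 2 || base > 36) return;  // unsupported base: leave s unchanged
    const char *digits = "0123456789abcdefghijklmnopqrstuvwxyz";

    // Work on the magnitude in a wider type so that INT_MIN is handled.
    const long long v = n;
    unsigned long long mag = v < 0 ? static_cast<unsigned long long>(-v)
                                   : static_cast<unsigned long long>(v);

    // Produce digits least-significant first, then reverse them.
    std::string rev;
    do {
        rev += digits[mag % base];
        mag /= base;
    } while (mag != 0);
    if (n < 0) rev += '-';
    std::reverse(rev.begin(), rev.end());

    s += rev;
}
```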
ULStringIterator ULString::begin ( ) const
Returns
an iterator pointing to the first character in this string (or to the end if this string is empty).
void ULString::clear ( )

Sets this string to the empty string.

ulint32 ULString::compare ( const ULString &  s,
const ULLanguage &  language 
) const

Language-dependent collation function.

Returns
-1, 0, or 1 depending on whether this string is strictly less than, equal to, or greater than the specified string using a collation appropriate for the specified language.
Parameters
[in]sThe string to compare with this string.
[in]languageThe language in which to compare the strings.
ulint32 ULString::compare ( const ULString &  s,
const ULLanguage &  language,
bool  ignoreCase,
bool  ignoreAccents 
) const

Language-dependent collation function, with control over case and accent sensitivity.

Returns
-1, 0, or 1 depending on whether this string is strictly less than, equal to, or greater than the specified string using a collation appropriate for the specified language (and dependent on the ignoreCase and ignoreAccents parameters).
Parameters
[in]sThe string to compare with this string.
[in]languageThe language in which to compare the strings.
[in]ignoreCaseTrue if the comparison should be case-insensitive.
[in]ignoreAccentsTrue if the comparison should be accent-insensitive.
bool ULString::contains ( ulchar  ch) const
Returns
true if this string contains the specified character at least once, and false otherwise.
Parameters
[in]chthe character to be found.
bool ULString::contains ( const char *  s) const
Returns
true if this string contains the specified string at least once as a substring, and false otherwise.

See ULString::ULString(const char *) for information on string literals.

Parameters
[in]sthe substring to be found.
bool ULString::contains ( const ULString &  s) const
Returns
true if this string contains the specified string at least once as a substring, and false otherwise.
Parameters
[in]sthe substring to be found.
ULStringIterator ULString::end ( ) const
Returns
an iterator pointing to the end of this string. Note that the "end" of the string is not the last character in the string, but rather a position immediately following the last character. Iterators that point to the end of the string must not be dereferenced.
bool ULString::endsWith ( const ULString &  tail) const
Returns
true if this string ends with the specified characters.
Parameters
[in]tailThe characters to match with the end of this string.
bool ULString::endsWith ( const ULString &  tail,
ULLanguage  language,
bool  ignoreCase,
bool  ignoreAccents 
) const
Returns
true if this string ends with the specified characters.
Parameters
[in] tail  The characters to match with the end of this string.
[in] language  The language in which to compare the tail to the end of this string.
[in] ignoreCase  If true, the comparison between tail and the end of this string will be case-insensitive.
[in] ignoreAccents  If true, the comparison is done both case- and accent-insensitively.
bool ULString::equals ( const ULString &  s,
bool  ignoreCase,
bool  ignoreAccents 
) const
Returns
true if this string and s are equal, taking into account the specified case- and accent-sensitivity. This method's implementation is designed to avoid string copying or other expensive memory allocations.
Parameters
[in]sthe string with which to compare this string.
[in]ignoreCasetrue if the strings should be compared case-insensitively.
[in]ignoreAccentstrue if the strings should be compared accent-insensitively.
void ULString::erase ( ULStringIterator &  iterator,
uluint32  nCharsToErase = 1 
)

Removes characters from this string.

Parameters
[in,out]iteratorThis parameter initially points to the first character to be erased. After the erasure, it points to the character that immediately followed the last erased character (or the end of the string, if the final character was erased).
[in]nCharsToEraseThe number of characters to erase (default=1).
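
The in/out behavior of the iterator parameter resembles the way std::string::erase returns an iterator to the element after the erased range. The sketch below mimics that contract on std::string; eraseAt is a hypothetical analog, not ULString code, and clamping the count to the remaining length is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical analog of ULString::erase(iterator, n) on std::string.
// On return, 'it' points at the character that followed the erased range,
// or at s.end() if the erased range reached the end of the string.
void eraseAt(std::string &s, std::string::iterator &it, std::size_t n = 1) {
    const std::size_t available = static_cast<std::size_t>(s.end() - it);
    if (n > available) n = available;  // assumption: clamp rather than fail
    it = s.erase(it, it + static_cast<std::ptrdiff_t>(n));
}
```

Because the iterator is updated in place, a caller can keep erasing in a loop without re-searching the string.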
void ULString::erase ( uluint32  start,
uluint32  nCharsToErase = 1 
)

Removes characters from this string.

Parameters
[in]startThe index of the first character to be erased.
[in]nCharsToEraseThe number of characters to erase.
ULStringIterator ULString::find ( ulchar  ch) const
Returns
an iterator pointing to the first occurrence of the specified character in this string, or to the end of this string if the character is not found.
Parameters
[in]chThe character to be found.
ULStringIterator ULString::find ( const char *  s) const
Returns
an iterator pointing to the first occurrence of the specified literal in this string, or to the end of this string if the literal is not found.

See ULString::ULString(const char *) for information on string literals.

Parameters
[in]sThe substring to be found.
ULStringIterator ULString::find ( const ULString &  s) const
Returns
an iterator pointing to the first occurrence of the specified string in this string, or to the end of this string if the specified string is not found.
Parameters
[in]sThe substring to be found.
void ULString::fromUTF8 ( const char *  utf8Buffer)

TODO this doc plus more of this type of method.

const char * ULString::getBuffer ( )
Returns
a pointer to a character buffer holding an encoded version of this string, using this string's current encoding.

Note that getBuffer is a non-const method to enable conversion to be done lazily. For the most part, a string will undergo many changes before it needs to be written to disk or viewed on screen. Performing a conversion from ULString's native representation into the encoded form at every change is thus a big time waste. The price we pay for better performance is a non-const getBuffer.
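
The trade-off described above can be sketched with a small standalone class that keeps a native representation plus a cached encoded buffer. LazyString and all of its members are hypothetical, the native form is assumed to be 16-bit code units, only UTF-8 output is shown, and surrogate pairs are not handled.

```cpp
#include <string>

// Hypothetical illustration of lazy encoding; not the ULString implementation.
class LazyString {
public:
    // Mutations only touch the native form and mark the cache as stale.
    void append(char16_t ch) {
        native_ += ch;
        dirty_ = true;
    }

    // Non-const because it may rebuild the cached encoded buffer.
    const char *getBuffer() {
        if (dirty_) {
            encode();
            dirty_ = false;
        }
        return encoded_.c_str();
    }

    // Number of conversions performed so far (for demonstration only).
    int conversions() const { return conversions_; }

private:
    // Converts the native form to UTF-8 (BMP code units only).
    void encode() {
        encoded_.clear();
        for (char16_t c : native_) {
            if (c < 0x80) {
                encoded_ += char(c);
            } else if (c < 0x800) {
                encoded_ += char(0xC0 | (c >> 6));
                encoded_ += char(0x80 | (c & 0x3F));
            } else {
                encoded_ += char(0xE0 | (c >> 12));
                encoded_ += char(0x80 | ((c >> 6) & 0x3F));
                encoded_ += char(0x80 | (c & 0x3F));
            }
        }
        ++conversions_;
    }

    std::u16string native_;
    std::string encoded_;
    bool dirty_ = false;
    int conversions_ = 0;
};
```

Any number of append calls costs no conversion at all; the encoding work happens once, on the first getBuffer call after a change.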

int ULString::getEncoding ( ) const
Returns
this string's current encoding. Possible values are enumerated in the ulstring.h header file, and include ULString::UTF8, ULString::UTF16BE (big endian), and ULString::UTF16LE (little endian).
int ULString::getIntegerValue ( ) const
Returns
the integer value of this string interpreted as a base-ten digit string.
Precondition
This string consists of only digit characters.
uluint32 ULString::hash ( uluint32  tableSize) const
Returns
a simple hash value for this string based on the specified hash table size. This function makes it easy to create a ULHashTable of ULString objects.
Parameters
[in]tableSizeThe size of the hash table for which this hash value will be used.
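
The reference does not say which hash function is used. Purely as an illustration of a hash that is reduced to a table size, the sketch below applies a multiplicative hash over 16-bit code units; simpleHash, the multiplier 31, and the result for a table size of 0 are assumptions, not ULAPI behavior.

```cpp
#include <cstdint>
#include <string>

// Hypothetical hash, for illustration only: polynomial hash over 16-bit
// code units, reduced modulo the table size.
std::uint32_t simpleHash(const std::u16string &s, std::uint32_t tableSize) {
    if (tableSize == 0) return 0;      // assumption: avoid division by zero
    std::uint32_t h = 0;
    for (char16_t c : s) {
        h = h * 31u + c;               // unsigned arithmetic wraps modulo 2^32
    }
    return h % tableSize;              // always in [0, tableSize)
}
```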
void ULString::insert ( const ULStringIterator &  iterator,
const ULString &  s 
)

Inserts characters into this string.

Parameters
[in]iteratorAn iterator pointing to the character in front of which to insert the new characters (or to the end of this string, in which case the new characters will be appended to this string).
[in]sThe characters to insert into this string.
void ULString::insert ( ulint32  insertionIndex,
const ULString &  s 
)

Inserts characters into this string.

Parameters
[in]insertionIndexThe 0-based index of the character in front of which to insert the new characters. If -1, the characters will be appended to the end of this string.
[in]sThe characters to insert into this string.
ULString ULString::join ( const ULList< ULString > &  stringList) const
Returns
a concatenation of the strings in stringList, with adjacent strings separated by copies of this string. For example, if this string is "X" and stringList is ["aaa", "bbb", "ccc"], the returned string will be "aaaXbbbXccc". If stringList is empty, an empty string is returned. If stringList has a single string, then stringList[0] is returned.
Parameters
[in]stringListthe list to be joined.
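
The documented behavior, including the empty-list and single-element cases, can be reproduced with a standalone sketch on standard containers. Here join is a free function taking the separator explicitly, with std::vector<std::string> standing in for ULList< ULString >; both choices are assumptions made so the example compiles without ULAPI.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical analog of ULString::join: 'separator' plays the role of
// "this string", and 'parts' plays the role of stringList.
std::string join(const std::string &separator,
                 const std::vector<std::string> &parts) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) out += separator;   // separator only between neighbors
        out += parts[i];
    }
    return out;
}
```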
uluint32 ULString::length ( ) const
Returns
the number of characters in this string.
ULString::operator const char * ( )
Returns
a const pointer to a buffer containing the UTF-8 encoding of this string.
ULString & ULString::operator+= ( const ULString &  s)

Appends the specified string to this ULString object.

Returns
a reference to this string.
Parameters
[in]sThe string to be appended to this ULString object.
ULString & ULString::operator+= ( const char *  s)

Appends the specified string literal to this ULString object.

Returns
a reference to this string.
Parameters
[in]sThe string to be appended to this ULString object. See ULString::ULString(const char *) for information on string literals.
ULString & ULString::operator+= ( ulchar  ch)

Appends the specified character to this ULString object.

Returns
a reference to this string.
Parameters
[in]chThe character to be appended to this ULString object.
ULString & ULString::operator= ( const ULString &  s)

Assignment operator.

ULString & ULString::operator= ( const char *  s)
Parameters
[in]sThe string to be copied into this ULString object. See ULString::ULString(const char *) for information on string literals.
ulchar ULString::operator[] ( int  n) const
Returns
the character found in this string at the specified index.
Parameters
[in]nThe index of the desired character. If n is negative, the character returned is the nth from the end (-1 means the last character, -2 the second to last, etc.).
Precondition
0 <= n < the length of this string, or 0 > n >= -length.
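
The negative-index convention amounts to adding the length to any negative n. A minimal standalone sketch on std::string, with charAt as a hypothetical helper that relies on the stated precondition rather than checking it:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: maps a possibly negative index onto a 0-based offset.
// Precondition (as documented): -length <= n < length.
char charAt(const std::string &s, int n) {
    const int length = static_cast<int>(s.size());
    const int offset = n < 0 ? length + n : n;  // -1 -> last, -2 -> second to last
    return s[static_cast<std::size_t>(offset)];
}
```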
ULStringReverseIterator ULString::rbegin ( ) const
Returns
a reverse iterator pointing to the last character in this string (or to the end if this string is empty).
ULStringReverseIterator ULString::rend ( ) const
Returns
a reverse iterator pointing to the reverse end of this string. Note that the "reverse end" of the string is not the first character in the string, but rather a position immediately preceding the first character. Iterators that point to the reverse end of the string must not be dereferenced.
void ULString::replace ( const ULString &  textToReplace,
const ULString &  replacementText 
)

Replaces all occurrences in this string of the specified substring with the specified replacement text.

Parameters
[in]textToReplaceThe substring to be replaced.
[in]replacementTextThe text with which to replace each occurrence of the substring.
void ULString::replace ( ulchar  charToReplace,
const ULString &  replacementText 
)

Replaces all occurrences in this string of the specified character with the specified replacement text.

Parameters
[in] charToReplace  The character to be replaced.
[in] replacementText  The text with which to replace each occurrence of the character.
void ULString::replace ( const ULStringIterator &  iterator,
uluint32  nCharsToReplace,
const ULString &  replacementText 
)

Replaces a substring of this string with the specified replacement text.

Parameters
[in]iteratorThe beginning of the substring to replace.
[in]nCharsToReplaceThe number of characters to replace.
[in]replacementTextThe text with which to replace the substring.
void ULString::replace ( uluint32  index,
uluint32  nCharsToReplace,
const ULString &  replacementText 
)

Replaces a substring of this string with the specified replacement text.

Parameters
[in]indexThe beginning of the substring to replace.
[in]nCharsToReplaceThe number of characters to replace.
[in]replacementTextThe text with which to replace the substring.
ULString & ULString::reverse ( )

Reverses this ULString in place.

Returns
a reference to this string.
ULStringIterator ULString::rfind ( ulchar  ch) const
Returns
an iterator pointing to the last occurrence of the specified character in this string, or to the end of this string if the character is not found.
Parameters
[in]chThe character to be found.
ULStringIterator ULString::rfind ( const char *  s) const
Returns
an iterator pointing to the last occurrence of the specified literal in this string, or to the end of this string if the literal is not found.

See ULString::ULString(const char *) for information on string literals.

Parameters
[in]sThe substring to be found.
ULStringIterator ULString::rfind ( const ULString &  s) const
Returns
an iterator pointing to the last occurrence of the specified string in this string, or to the end of this string if the specified string is not found.
Parameters
[in]sThe substring to be found.
void ULString::setAt ( uluint32  index,
ulchar  value 
)

Sets the character at the specified index to the specified character.

Parameters
[in]indexThe index of the character to change.
[in]valueThe value to which to set the character.
Precondition
The index is a valid index for this string.
void ULString::setEncoding ( const char *  encodingName)

Sets the encoding for this string, to be used when getBuffer is called.

Parameters
[in]encodingNamethe name of the desired encoding. Possible values include "utf-8", "UTF-8", "utf-16be", "UTF-16BE", "utf-16le", and "UTF-16LE" (where "BE" and "LE" stand for "big endian" and "little endian", respectively).
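
One way to map an encoding name onto the enumerators is sketched below. The enum mirrors the documented values, but parseEncodingName is a hypothetical helper, and fully case-insensitive matching is an assumption: the reference lists only all-lower-case and all-upper-case spellings, and does not say what happens for an unrecognized name.

```cpp
#include <cctype>
#include <string>

// Mirrors the enumerators documented for ULString.
enum Encoding { UTF8, UTF16BE, UTF16LE };

// Hypothetical helper: returns true and sets 'out' if the name is recognized;
// returns false and leaves 'out' untouched otherwise.
bool parseEncodingName(const char *name, Encoding &out) {
    if (name == nullptr) return false;

    // Lower-case a copy so that "UTF-8" and "utf-8" compare equal.
    std::string lower;
    for (const char *p = name; *p != '\0'; ++p) {
        lower += static_cast<char>(std::tolower(static_cast<unsigned char>(*p)));
    }

    if (lower == "utf-8")    { out = UTF8;    return true; }
    if (lower == "utf-16be") { out = UTF16BE; return true; }
    if (lower == "utf-16le") { out = UTF16LE; return true; }
    return false;
}
```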
void ULString::setEncoding ( int  newEncoding)

Sets the encoding for this string, to be used when getBuffer is called.

Parameters
[in]newEncodingthe desired encoding. Possible values are enumerated in the ulstring.h header file, and include ULString::UTF8, ULString::UTF16BE (big endian), and ULString::UTF16LE (little endian).
void ULString::split ( const ULString &  delimiter,
ULList< ULString > &  stringList,
bool  stripParts = false 
) const

Creates a list of strings consisting of the substrings of this string between occurrences of the delimiter string. For example, if this string is "abracadabra" and the delimiter is "a", then the resulting list is ("", "br", "c", "d", "br", "").

Parameters
[in]delimiterThe delimiter substring.
[out]stringListThe list of substrings obtained by the splitting.
[in]stripPartsTrue if the substrings should be stripped of white space on the left and right before being added to the string list.
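
The "abracadabra" example, including the empty strings at both ends, can be checked with a standalone sketch on standard containers. This split is a hypothetical free function returning std::vector<std::string> instead of filling a ULList< ULString >, it omits the stripParts option, and returning the whole text for an empty delimiter is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical analog of ULString::split on std::string.
std::vector<std::string> split(const std::string &text,
                               const std::string &delimiter) {
    std::vector<std::string> parts;
    if (delimiter.empty()) {           // assumption: nothing to split on
        parts.push_back(text);
        return parts;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = text.find(delimiter, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));   // trailing piece, may be ""
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + delimiter.size();
    }
    return parts;
}
```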
bool ULString::startsWith ( const ULString &  head) const
Returns
true if this string starts with the specified characters.
Parameters
[in]headThe characters to match with the beginning of this string.
bool ULString::startsWith ( const ULString &  head,
ULCollator *  collator,
bool  ignoreCase,
bool  ignoreAccents 
) const
Returns
true if this string starts with the specified characters.
Parameters
[in]headThe characters to match with the beginning of this string.
[in]collatorA collator with which to compare the head to this string.
[in]ignoreCaseIf true, the comparison between head and the beginning of this string will be case-insensitive.
[in]ignoreAccentsIf true, the comparison is done both case- and accent-insensitively.
void ULString::strip ( const char *  charactersToRemove = 0)

Removes zero or more copies of the specified characters from both ends of this string. For example, if charactersToRemove = ",.!?:; ", then strip will remove punctuation and spaces from both ends of the string.

Parameters
[in]charactersToRemovethe characters to strip. If this pointer is null, then strip removes white space (spaces, tabs, linefeeds, and carriage returns).
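
A standalone sketch of the same behavior on std::string follows, including the null-pointer default that falls back to white space. Here strip is a hypothetical free function rather than the ULString member, and it works on bytes rather than on ulchar values.

```cpp
#include <cstddef>
#include <string>

// Hypothetical analog of ULString::strip on std::string.
// A null charactersToRemove means "strip spaces, tabs, linefeeds, and
// carriage returns", as documented for the member function.
void strip(std::string &s, const char *charactersToRemove = nullptr) {
    const std::string set = charactersToRemove ? charactersToRemove : " \t\n\r";
    const std::size_t first = s.find_first_not_of(set);
    if (first == std::string::npos) {  // nothing but strippable characters
        s.clear();
        return;
    }
    const std::size_t last = s.find_last_not_of(set);
    s = s.substr(first, last - first + 1);
}
```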
ULString ULString::substr ( ulint32  start,
ulint32  nChars 
) const
Returns
the substring of this string consisting of the nChars characters starting at index start.
Parameters
[in]startThe index of the first character of the desired substring.
[in]nCharsThe number of characters in the substring.
ULString ULString::substr ( const ULStringIterator &  start,
const ULStringIterator &  end 
) const
Returns
the substring of this string starting at the character pointed to by the iterator start, and ending at the character immediately preceding the iterator end.
Parameters
[in]startAn iterator pointing to the first character of the desired substring.
[in]endAn iterator pointing to the position immediately following the end of the substring.
ULString & ULString::toBase ( )

Removes accents from this string using the default locale.

Returns
a reference to this string.
ULString & ULString::toBase ( const ULLanguage &  language)

Removes accents from this string using a locale appropriate for the specified language.

Returns
a reference to this string.
Parameters
[in]languageThe language.
ULString & ULString::toLower ( )

Changes this string to lower case using the default locale.

Returns
a reference to this string.
ULString & ULString::toLower ( const ULLanguage &  language)

Changes this string to lower case using a locale appropriate for the specified language.

Returns
a reference to this string.
Parameters
[in]languageThe language.
ULString & ULString::toUpper ( )

Changes this string to upper case using the default locale.

Returns
a reference to this string.
ULString & ULString::toUpper ( const ULLanguage &  language)

Changes this string to upper case using a locale appropriate for the specified language.

Returns
a reference to this string.
Parameters
[in]languageThe language.

Friends And Related Function Documentation

bool operator< ( const ULString &  a,
const ULString &  b 
)
friend
Returns
true if the first string is strictly less than the second string lexicographically by 16-bit character values.
Parameters
[in] a  The first string.
[in] b  The second string.
bool operator== ( const ULString &  a,
const ULString &  b 
)
friend
Returns
true if the first string is equal to the second string using byte for byte comparison.
Parameters
[in] a  The first string.
[in] b  The second string.
friend class ULCollator
friend
friend class ULStringIterator
friend
friend class ULStringReverseIterator
friend

The documentation for this class was generated from the following files: