UTF-8 String Functions
The UTF-8 string functions handle string operations on UTF-8 encoded strings.
|
Functions |
| long | rtxUTF8ToUnicode (OSCTXT *pctxt, const OSUTF8CHAR *inbuf, OSUNICHAR *outbuf, size_t outbufsiz) |
| | This function converts a UTF-8 string to a Unicode string (UTF-16).
|
| int | rtxValidateUTF8 (OSCTXT *pctxt, const OSUTF8CHAR *inbuf) |
| | This function will validate a UTF-8 encoded string to ensure that it is encoded correctly.
|
| size_t | rtxUTF8Len (const OSUTF8CHAR *inbuf) |
| | This function will return the length (in characters) of a null-terminated UTF-8 encoded string.
|
| size_t | rtxUTF8LenBytes (const OSUTF8CHAR *inbuf) |
| | This function will return the length (in bytes) of a null-terminated UTF-8 encoded string.
|
| int | rtxUTF8CharSize (OS32BITCHAR wc) |
| | This function will return the number of bytes needed to encode the given 32-bit universal character value as a UTF-8 character.
|
| int | rtxUTF8EncodeChar (OS32BITCHAR wc, OSOCTET *buf, size_t bufsiz) |
| | This function will convert a wide character into an encoded UTF-8 character byte string.
|
| int | rtxUTF8DecodeChar (OSCTXT *pctxt, const OSUTF8CHAR *pinbuf, int *pInsize) |
| | This function will convert an encoded UTF-8 character byte string into a wide character value.
|
| OS32BITCHAR | rtxUTF8CharToWC (const OSUTF8CHAR *buf, OSUINT32 *len) |
| | This function will convert a UTF-8 encoded character value into a wide character.
|
| OSUTF8CHAR * | rtxUTF8StrChr (OSUTF8CHAR *utf8str, OS32BITCHAR utf8char) |
| | This function finds a character in the given UTF-8 character string.
|
| OSUTF8CHAR * | rtxUTF8Strdup (OSCTXT *pctxt, const OSUTF8CHAR *utf8str) |
| | This function creates a duplicate copy of the given UTF-8 character string.
|
| OSUTF8CHAR * | rtxUTF8Strndup (OSCTXT *pctxt, const OSUTF8CHAR *utf8str, size_t nbytes) |
| | This function creates a duplicate copy of the first nbytes bytes of the given UTF-8 character string.
|
| OSBOOL | rtxUTF8StrEqual (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2) |
| | This function compares two UTF-8 string values for equality.
|
| OSBOOL | rtxUTF8StrnEqual (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2, size_t count) |
| | This function compares two UTF-8 string values for equality, up to a given number of bytes.
|
| int | rtxUTF8Strcmp (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2) |
| | This function compares two UTF-8 character strings and returns a trinary result (equal, less than, greater than).
|
| int | rtxUTF8Strncmp (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2, size_t count) |
| | This function compares two UTF-8 character strings, up to a given number of bytes, and returns a trinary result (equal, less than, greater than).
|
| int | rtxUTF8StrToInt (const OSUTF8CHAR *utf8str, OSINT32 *pvalue) |
| | This function converts the given null-terminated UTF-8 string to an integer value.
|
| int | rtxUTF8StrnToInt (const OSUTF8CHAR *utf8str, size_t nbytes, OSINT32 *pvalue) |
| | This function converts the given part of a UTF-8 string to an integer value.
|
Detailed Description
The UTF-8 string functions handle string operations on UTF-8 encoded strings.
This is the default character string data type used for encoded XML data. UTF-8 strings are represented in C as strings of unsigned characters (bytes) to cover the full range of possible single character encodings.
Function Documentation
|
| int rtxUTF8CharSize (OS32BITCHAR wc) |
This function will return the number of bytes needed to encode the given 32-bit universal character value as a UTF-8 character.
- Parameters:
-
| wc | 32-bit wide character value. |
- Returns:
- Number of bytes needed to encode as UTF-8.
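As an illustration of the size rule, the following is a minimal standalone sketch, not the library's implementation. The name `utf8_char_size` is hypothetical, and the sketch assumes the RFC 3629 limit of four bytes per character; the library may treat values above U+10FFFF differently.

```c
#include <stdint.h>

/* Hypothetical stand-in, not the library's code: number of bytes needed to
 * encode code point wc as UTF-8, assuming the RFC 3629 four-byte limit. */
int utf8_char_size(uint32_t wc)
{
    if (wc < 0x80u)     return 1;  /* ASCII range */
    if (wc < 0x800u)    return 2;
    if (wc < 0x10000u)  return 3;
    if (wc < 0x110000u) return 4;
    return -1;                     /* beyond the Unicode range (assumed error) */
}
```

Under these assumptions, U+0041 needs one byte, U+00E9 two, U+20AC three, and U+1F600 four.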
|
| OS32BITCHAR rtxUTF8CharToWC (const OSUTF8CHAR *buf, OSUINT32 *len) |
This function will convert a UTF-8 encoded character value into a wide character.
- Parameters:
-
| buf | Pointer to UTF-8 character value. |
| len | Pointer to integer to receive decoded size (in bytes) of the UTF-8 character value sequence. |
- Returns:
- Converted wide character value.
|
| int rtxUTF8DecodeChar (OSCTXT *pctxt, const OSUTF8CHAR *pinbuf, int *pInsize) |
This function will convert an encoded UTF-8 character byte string into a wide character value.
- Parameters:
-
| pctxt | A pointer to a context structure. |
| pinbuf | Pointer to UTF-8 byte sequence to be decoded. |
| pInsize | Pointer to integer to receive the number of bytes consumed (i.e. the size of the encoded character). |
- Returns:
- 32-bit wide character value.
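To show what decoding a single character involves, here is a minimal standalone sketch. It does not use the library or its context structure; `utf8_decode_char` and its error value `0xFFFFFFFF` are assumptions made for this example. It checks the lead byte and continuation bytes but does not reject overlong forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at p.
 * Stores the number of bytes consumed in *len and returns the code point.
 * On a bad lead or continuation byte, sets *len to 0 and returns 0xFFFFFFFF. */
uint32_t utf8_decode_char(const unsigned char *p, int *len)
{
    uint32_t wc;
    int n, i;

    assert(p != NULL && len != NULL);

    if (p[0] < 0x80u)                 { wc = p[0];         n = 1; }
    else if ((p[0] & 0xE0u) == 0xC0u) { wc = p[0] & 0x1Fu; n = 2; }
    else if ((p[0] & 0xF0u) == 0xE0u) { wc = p[0] & 0x0Fu; n = 3; }
    else if ((p[0] & 0xF8u) == 0xF0u) { wc = p[0] & 0x07u; n = 4; }
    else { *len = 0; return 0xFFFFFFFFu; }

    /* Each continuation byte must look like 10xxxxxx and adds six bits.
     * A null terminator fails this check, so the loop never reads past it. */
    for (i = 1; i < n; i++) {
        if ((p[i] & 0xC0u) != 0x80u) { *len = 0; return 0xFFFFFFFFu; }
        wc = (wc << 6) | (p[i] & 0x3Fu);
    }
    *len = n;
    return wc;
}
```

A caller can walk a string by repeatedly decoding at the current position and advancing by the returned size.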
|
| int rtxUTF8EncodeChar (OS32BITCHAR wc, OSOCTET *buf, size_t bufsiz) |
This function will convert a wide character into an encoded UTF-8 character byte string.
- Parameters:
-
| wc | 32-bit wide character value. |
| buf | Buffer to receive encoded UTF-8 character value. |
| bufsiz | Size of the buffer to receive the encoded value. |
- Returns:
- Completion status of operation:
- 0 = success,
- negative return value is error.
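The encoding step can be sketched as follows. This is a standalone illustration rather than the library's code: `utf8_encode_char` is a hypothetical name, the extra `outlen` parameter is an assumption added so the caller can learn the encoded length, and surrogate code points are not rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the UTF-8 encoding of wc into buf.
 * Returns 0 on success, or -1 if wc is out of range or buf is too small.
 * The encoded length is reported through *outlen (assumed for this sketch). */
int utf8_encode_char(uint32_t wc, unsigned char *buf, size_t bufsiz, size_t *outlen)
{
    size_t n, i;
    unsigned char lead;

    assert(buf != NULL && outlen != NULL);

    if (wc < 0x80u)          { n = 1; lead = 0x00u; }
    else if (wc < 0x800u)    { n = 2; lead = 0xC0u; }
    else if (wc < 0x10000u)  { n = 3; lead = 0xE0u; }
    else if (wc < 0x110000u) { n = 4; lead = 0xF0u; }
    else return -1;

    if (bufsiz < n) return -1;

    /* Fill continuation bytes from the end, six bits at a time. */
    for (i = n - 1; i > 0; i--) {
        buf[i] = (unsigned char)(0x80u | (wc & 0x3Fu));
        wc >>= 6;
    }
    buf[0] = (unsigned char)(lead | wc);  /* remaining high bits go in the lead byte */
    *outlen = n;
    return 0;
}
```

The return convention follows the documented one: 0 for success and a negative value for an error.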
|
| size_t rtxUTF8Len (const OSUTF8CHAR *inbuf) |
This function will return the length (in characters) of a null-terminated UTF-8 encoded string.
- Parameters:
-
| inbuf | A pointer to the null-terminated UTF-8 encoded string. |
- Returns:
- Number of characters in the string. Note that this may differ from the number of bytes, because a UTF-8 character can span multiple bytes.
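The difference between the character count and the byte count can be seen in a small standalone sketch. The helper names `utf8_len_chars` and `utf8_len_bytes` are hypothetical, and the character count assumes the input is already well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count characters in a null-terminated UTF-8 string by
 * counting every byte that is not a continuation byte (10xxxxxx).
 * Assumes well-formed input. */
size_t utf8_len_chars(const unsigned char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != 0; s++) {
        if ((*s & 0xC0u) != 0x80u) count++;
    }
    return count;
}

/* Hypothetical helper: the byte length is simply the C string length. */
size_t utf8_len_bytes(const unsigned char *s)
{
    assert(s != NULL);
    return strlen((const char *)s);
}
```

For the string "héllo €" (with é encoded as two bytes and € as three), the character count is 7 and the byte count is 10.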
|
| size_t rtxUTF8LenBytes (const OSUTF8CHAR *inbuf) |
This function will return the length (in bytes) of a null-terminated UTF-8 encoded string.
- Parameters:
-
| inbuf | A pointer to the null-terminated UTF-8 encoded string. |
- Returns:
- Number of bytes in the string.
|
| OSUTF8CHAR * rtxUTF8StrChr (OSUTF8CHAR *utf8str, OS32BITCHAR utf8char) |
This function finds a character in the given UTF-8 character string.
It is similar to the C strchr function.
- Parameters:
-
| utf8str | Null-terminated UTF-8 string to be searched. |
| utf8char | 32-bit Unicode character to find. |
- Returns:
- Pointer to the first occurrence of the character in the string, or NULL if the character is not found.
|
| int rtxUTF8Strcmp (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2) |
This function compares two UTF-8 character strings and returns a trinary result (equal, less than, greater than).
It is similar to the C strcmp function.
- Parameters:
-
| utf8str1 | UTF-8 string to be compared. |
| utf8str2 | UTF-8 string to be compared. |
- Returns:
- -1 if utf8str1 is less than utf8str2, 0 if the two strings are equal, and +1 if utf8str1 is greater than utf8str2.
|
| OSUTF8CHAR * rtxUTF8Strdup (OSCTXT *pctxt, const OSUTF8CHAR *utf8str) |
This function creates a duplicate copy of the given UTF-8 character string.
It is similar to the C strdup function. Memory for the duplicated string is allocated using the rtxMemAlloc function.
- Parameters:
-
| pctxt | A pointer to a context structure. |
| utf8str | Null-terminated UTF-8 string to be duplicated. |
- Returns:
- Pointer to duplicated string value.
|
| OSBOOL rtxUTF8StrEqual (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2) |
This function compares two UTF-8 string values for equality.
- Parameters:
-
| utf8str1 | UTF-8 string to be compared. |
| utf8str2 | UTF-8 string to be compared. |
- Returns:
- TRUE if equal, FALSE if not.
|
| int rtxUTF8Strncmp (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2, size_t count) |
This function compares two UTF-8 character strings and returns a trinary result (equal, less than, greater than).
In this case, a maximum count of the number of bytes to compare can be specified. It is similar to the C strncmp function.
- Parameters:
-
| utf8str1 | UTF-8 string to be compared. |
| utf8str2 | UTF-8 string to be compared. |
| count | Number of bytes to compare. |
- Returns:
- -1 if utf8str1 is less than utf8str2, 0 if the two strings are equal, and +1 if utf8str1 is greater than utf8str2.
|
| OSUTF8CHAR * rtxUTF8Strndup (OSCTXT *pctxt, const OSUTF8CHAR *utf8str, size_t nbytes) |
This function creates a duplicate copy of the given UTF-8 character string.
It is similar to the rtxUTF8Strdup function except that it allows the number of bytes to duplicate to be specified. Memory for the duplicated string is allocated using the rtxMemAlloc function.
- Parameters:
-
| pctxt | A pointer to a context structure. |
| utf8str | UTF-8 string to be duplicated. |
| nbytes | Number of bytes from utf8str to duplicate. |
- Returns:
- Pointer to duplicated string value.
|
| OSBOOL rtxUTF8StrnEqual (const OSUTF8CHAR *utf8str1, const OSUTF8CHAR *utf8str2, size_t count) |
This function compares two UTF-8 string values for equality.
It is similar to the rtxUTF8StrEqual function except that it allows the number of bytes to compare to be specified.
- Parameters:
-
| utf8str1 | UTF-8 string to be compared. |
| utf8str2 | UTF-8 string to be compared. |
| count | Number of bytes to compare. |
- Returns:
- TRUE if equal, FALSE if not.
|
| int rtxUTF8StrnToInt (const OSUTF8CHAR *utf8str, size_t nbytes, OSINT32 *pvalue) |
This function converts the given part of a UTF-8 string to an integer value.
It is assumed the string contains only numeric digits and whitespace. It is similar to the C atoi function except that the result is returned as a separate argument and an error status value is returned if the conversion cannot be performed successfully.
- Parameters:
-
| utf8str | UTF-8 string to convert. It need not be null-terminated. |
| nbytes | Size in bytes of utf8str. |
| pvalue | Pointer to integer to receive result. |
- Returns:
- Status: 0 = OK, negative value = error
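The pattern described here (result through a pointer, status as the return value, input bounded by a byte count) can be sketched in standalone C. `utf8_strn_to_int` is a hypothetical name, and accepting an optional sign and rejecting overflow are assumptions of this sketch, not documented behavior of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int is_ws(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Hypothetical helper: parse a decimal integer from the first nbytes bytes
 * of s, which need not be null-terminated. Leading and trailing whitespace
 * and an optional sign are accepted (the sign is an assumption).
 * Returns 0 on success, -1 on bad input or overflow. */
int utf8_strn_to_int(const unsigned char *s, size_t nbytes, int32_t *pvalue)
{
    size_t i = 0, ndigits = 0;
    int negative = 0;
    int64_t acc = 0;

    assert(s != NULL && pvalue != NULL);

    while (i < nbytes && is_ws(s[i])) i++;
    if (i < nbytes && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        i++;
    }
    for (; i < nbytes && s[i] >= '0' && s[i] <= '9'; i++, ndigits++) {
        acc = acc * 10 + (s[i] - '0');
        if (acc > (int64_t)INT32_MAX + 1) return -1;  /* cannot fit either sign */
    }
    while (i < nbytes && is_ws(s[i])) i++;

    if (ndigits == 0 || i != nbytes) return -1;       /* empty or trailing junk */
    if (negative) acc = -acc;
    if (acc > INT32_MAX) return -1;                    /* positive overflow */
    *pvalue = (int32_t)acc;
    return 0;
}
```

Unlike atoi, a caller can distinguish a genuine zero from a failed conversion by checking the returned status.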
|
| int rtxUTF8StrToInt (const OSUTF8CHAR *utf8str, OSINT32 *pvalue) |
This function converts the given null-terminated UTF-8 string to an integer value.
It is assumed the string contains only numeric digits and whitespace. It is similar to the C atoi function except that the result is returned as a separate argument and an error status value is returned if the conversion cannot be performed successfully.
- Parameters:
-
| utf8str | Null-terminated UTF-8 string to convert. |
| pvalue | Pointer to integer to receive result. |
- Returns:
- Status: 0 = OK, negative value = error
|
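| long rtxUTF8ToUnicode (OSCTXT *pctxt, const OSUTF8CHAR *inbuf, OSUNICHAR *outbuf, size_t outbufsiz) |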
This function converts a UTF-8 string to a Unicode string (UTF-16).
The Unicode string is stored as an array of 16-bit characters (unsigned short integers).
- Parameters:
-
| pctxt | A pointer to a context structure. |
| inbuf | UTF-8 string to convert. |
| outbuf | Output buffer to receive converted Unicode data. |
| outbufsiz | Size of the output buffer in bytes. |
- Returns:
- Completion status of operation:
- number of octets put in the output buffer,
- negative return value is error.
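For readers unfamiliar with the conversion itself, the following standalone sketch shows how UTF-8 maps to 16-bit UTF-16 units, including surrogate pairs for characters above U+FFFF. It is not the library's implementation: `utf8_to_utf16` is a hypothetical name, it takes no context argument, and it measures the output buffer and the return value in 16-bit units rather than in bytes or octets as documented above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert null-terminated UTF-8 to UTF-16 code units.
 * outcap is the capacity of out in 16-bit units (an assumption of this sketch).
 * Returns the number of units written, or -1 on malformed input or overflow. */
long utf8_to_utf16(const unsigned char *in, uint16_t *out, size_t outcap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    while (*in != 0) {
        uint32_t wc;
        int len, i;

        if (in[0] < 0x80u)                 { wc = in[0];         len = 1; }
        else if ((in[0] & 0xE0u) == 0xC0u) { wc = in[0] & 0x1Fu; len = 2; }
        else if ((in[0] & 0xF0u) == 0xE0u) { wc = in[0] & 0x0Fu; len = 3; }
        else if ((in[0] & 0xF8u) == 0xF0u) { wc = in[0] & 0x07u; len = 4; }
        else return -1;

        for (i = 1; i < len; i++) {
            if ((in[i] & 0xC0u) != 0x80u) return -1;
            wc = (wc << 6) | (in[i] & 0x3Fu);
        }
        in += len;

        if (wc < 0x10000u) {
            if (n + 1 > outcap) return -1;
            out[n++] = (uint16_t)wc;
        } else {
            /* Characters above U+FFFF become a high/low surrogate pair. */
            if (wc > 0x10FFFFu || n + 2 > outcap) return -1;
            wc -= 0x10000u;
            out[n++] = (uint16_t)(0xD800u | (wc >> 10));
            out[n++] = (uint16_t)(0xDC00u | (wc & 0x3FFu));
        }
    }
    return (long)n;
}
```

In this sketch, U+1F600 (four UTF-8 bytes) becomes the two units 0xD83D and 0xDE00.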
|
| int rtxValidateUTF8 (OSCTXT *pctxt, const OSUTF8CHAR *inbuf) |
This function will validate a UTF-8 encoded string to ensure that it is encoded correctly.
- Parameters:
-
| pctxt | A pointer to a context structure. |
| inbuf | A pointer to the null-terminated UTF-8 encoded string. |
- Returns:
- Completion status of operation:
- 0 = success,
- negative return value is error.
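What "encoded correctly" typically means can be sketched in standalone C. This is an illustration under the RFC 3629 rules, not the library's implementation: `utf8_validate` is a hypothetical name, and the exact set of checks the library performs is not stated in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 0 if the null-terminated string s is
 * well-formed UTF-8 (RFC 3629 rules assumed), or -1 otherwise. */
int utf8_validate(const unsigned char *s)
{
    assert(s != NULL);

    while (*s != 0) {
        unsigned long wc, min;
        int n, i;

        if (s[0] < 0x80u)                 { s++; continue; }
        else if ((s[0] & 0xE0u) == 0xC0u) { wc = s[0] & 0x1Fu; n = 2; min = 0x80ul; }
        else if ((s[0] & 0xF0u) == 0xE0u) { wc = s[0] & 0x0Fu; n = 3; min = 0x800ul; }
        else if ((s[0] & 0xF8u) == 0xF0u) { wc = s[0] & 0x07u; n = 4; min = 0x10000ul; }
        else return -1;                   /* stray continuation or invalid lead byte */

        for (i = 1; i < n; i++) {
            /* A terminator here means the sequence was cut short. */
            if ((s[i] & 0xC0u) != 0x80u) return -1;
            wc = (wc << 6) | (s[i] & 0x3Fu);
        }
        if (wc < min) return -1;                           /* overlong encoding */
        if (wc > 0x10FFFFul) return -1;                    /* beyond Unicode range */
        if (wc >= 0xD800ul && wc <= 0xDFFFul) return -1;   /* surrogate code point */
        s += n;
    }
    return 0;
}
```

The return convention matches the documented one: 0 for success and a negative value for an error.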
|
|
This file was last modified on 8 Jan 2007. XBinder, Version 1.1.9