

Character String Types


XSD defines several character string types, including string, normalizedString, and token. All of these XSD types are mapped to the OSXMLSTRING type, an internal type that represents a UTF-8 character string. This type is defined in osSysTypes.h as follows:
typedef struct OSXMLSTRING {
   OSBOOL cdata;
   const OSUTF8CHAR* value;
} OSXMLSTRING;

The cdata member of this structure is a flag indicating whether the value is to be encoded as an XML CDATA section. The value member is a pointer to the string to be encoded. The underlying C type for OSUTF8CHAR is unsigned char, so every byte of a UTF-8 string, including the bytes of multi-byte sequences, is represented as a non-negative number.
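The effect of the unsigned char choice can be seen in a small sketch. The typedefs below are local stand-ins that mirror the definition quoted above so the fragment compiles on its own; real code would include osSysTypes.h instead. The exact typedef used for OSBOOL is an assumption, and countNonAsciiBytes is a hypothetical helper written for this illustration, not a run-time library function.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the osSysTypes.h definitions (OSBOOL typedef assumed). */
typedef unsigned char OSUTF8CHAR;
typedef unsigned char OSBOOL;

typedef struct OSXMLSTRING {
   OSBOOL cdata;
   const OSUTF8CHAR* value;
} OSXMLSTRING;

/* Hypothetical helper: counts the bytes that belong to multi-byte UTF-8
 * sequences, that is, bytes with the high bit set. */
size_t countNonAsciiBytes (const OSXMLSTRING* pstr)
{
   size_t n = 0;
   const OSUTF8CHAR* p;

   assert (pstr != NULL);
   if (pstr->value == NULL) return 0;

   for (p = pstr->value; *p != 0; p++) {
      /* Because OSUTF8CHAR is unsigned, bytes 0x80..0xFF compare as
       * 128..255. On platforms where plain char is signed and 8 bits
       * wide, the same test on a char would never be true. */
      if (*p >= 0x80) n++;
   }
   return n;
}
```

For the string "café" encoded in UTF-8, the helper reports two such bytes (0xC3 and 0xA9).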
For C++, the built-in OSXMLSTRING structure is extended to form an XML string class as follows:

class EXTERNRTX OSXMLStringClass : 
public OSXMLSTRING, public OSRTBaseType {
 public:
   /**
    * The default constructor creates an empty string.
    */
   OSXMLStringClass();
   ...
} ;

This makes the data members from the C type available in the C++ case as well. It also adds constructors and other methods to allow the member variables to be initialized and manipulated.
The general mapping is as follows:
XSD type:
<xsd:simpleType name="TypeName">
   <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
 
Generated C code:
typedef OSXMLSTRING TypeName;
 
Generated C++ code:
class TypeName : public OSXMLStringClass {
   ...
} ;


In this case, xsd:string stands for the XSD string base type and all other types derived from it. For C, a variable of this type can be populated by casting a string literal to const OSUTF8CHAR* as follows:

	TypeName strval;
	strval.cdata = FALSE;
	strval.value = (const OSUTF8CHAR*) "my string";
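
To illustrate what the cdata flag asks of an encoder, the following is a minimal sketch and not the generated XBinder encoder. The typedefs are local stand-ins for the osSysTypes.h definitions (the OSBOOL typedef and the TRUE/FALSE macros are assumptions), and sketchEncodeContent and appendText are hypothetical names introduced only for this example. The sketch writes the value either as a CDATA section or as escaped character data, and simply rejects a CDATA value that contains "]]>" rather than splitting it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the osSysTypes.h definitions (assumed). */
typedef unsigned char OSUTF8CHAR;
typedef unsigned char OSBOOL;
#ifndef FALSE
#define FALSE 0
#define TRUE  1
#endif

typedef struct OSXMLSTRING {
   OSBOOL cdata;
   const OSUTF8CHAR* value;
} OSXMLSTRING;

typedef OSXMLSTRING TypeName;

/* Appends n bytes of src to buf, keeping it null-terminated.
 * Returns 0 on success or -1 if the buffer is too small. */
static int appendText
   (char* buf, size_t bufsiz, size_t* plen, const char* src, size_t n)
{
   if (*plen + n + 1 > bufsiz) return -1;
   memcpy (buf + *plen, src, n);
   *plen += n;
   buf[*plen] = '\0';
   return 0;
}

/* Hypothetical sketch: writes the element content for pstr into buf.
 * Returns 0 on success or -1 on failure. */
int sketchEncodeContent (const TypeName* pstr, char* buf, size_t bufsiz)
{
   const char* text;
   size_t len = 0, i, n;
   int stat = 0;

   assert (pstr != NULL && pstr->value != NULL);
   assert (buf != NULL && bufsiz > 0);

   buf[0] = '\0';
   text = (const char*) pstr->value;
   n = strlen (text);

   if (pstr->cdata) {
      /* "]]>" would end the section early; this sketch rejects it. */
      if (strstr (text, "]]>") != NULL) return -1;
      if (appendText (buf, bufsiz, &len, "<![CDATA[", 9) != 0) return -1;
      if (appendText (buf, bufsiz, &len, text, n) != 0) return -1;
      return appendText (buf, bufsiz, &len, "]]>", 3);
   }

   /* Not CDATA: escape the markup characters. */
   for (i = 0; i < n && stat == 0; i++) {
      switch (text[i]) {
      case '&': stat = appendText (buf, bufsiz, &len, "&amp;", 5); break;
      case '<': stat = appendText (buf, bufsiz, &len, "&lt;", 4); break;
      case '>': stat = appendText (buf, bufsiz, &len, "&gt;", 4); break;
      default:  stat = appendText (buf, bufsiz, &len, &text[i], 1);
      }
   }
   return stat;
}
```

With cdata set to FALSE, the value "a<b" is written as a&lt;b; with cdata set to TRUE, the same value is written unescaped inside a CDATA section.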

In the case of C++, the assignment operator built into the string class can be used to set the string value:
	strval = "my string";

This sets the cdata member to false, as in the C example, and makes a deep copy of the text in the object.
String-based types may be further restricted through the use of facets such as length, minLength, maxLength, and pattern. These facets have no effect on the generated C or C++ type definitions. Instead, constraint checks are added to the generated encoders and decoders to ensure that values of the type satisfy the specified constraints.
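
As a rough illustration of such a check, the sketch below tests a value against minLength and maxLength bounds. It is not the code XBinder generates: the typedefs are local stand-ins for the osSysTypes.h definitions (OSBOOL typedef assumed), utf8CharCount and sketchCheckLength are hypothetical names, and the 0 / -1 return values stand in for whatever status codes the real run-time uses. XSD measures string length in characters rather than bytes, so the sketch counts UTF-8 characters by skipping continuation bytes; it assumes the input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the osSysTypes.h definitions (assumed). */
typedef unsigned char OSUTF8CHAR;
typedef unsigned char OSBOOL;

typedef struct OSXMLSTRING {
   OSBOOL cdata;
   const OSUTF8CHAR* value;
} OSXMLSTRING;

/* Counts characters in a valid, null-terminated UTF-8 string.
 * Continuation bytes have the bit pattern 10xxxxxx and are skipped. */
size_t utf8CharCount (const OSUTF8CHAR* s)
{
   size_t n = 0;
   assert (s != NULL);
   for (; *s != 0; s++) {
      if ((*s & 0xC0) != 0x80) n++;
   }
   return n;
}

/* Hypothetical length check: returns 0 if the character count lies in
 * [minLen, maxLen], or -1 if the constraint is violated. */
int sketchCheckLength
   (const OSXMLSTRING* pstr, size_t minLen, size_t maxLen)
{
   size_t n;
   assert (pstr != NULL && pstr->value != NULL);
   n = utf8CharCount (pstr->value);
   return (n >= minLen && n <= maxLen) ? 0 : -1;
}
```

For example, the UTF-8 string "café" occupies five bytes but counts as four characters, so it passes a maxLength of 4 and fails a maxLength of 3.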

Copyright © Objective Systems 2002-2007
This document may be distributed in any form, electronic or otherwise, provided that it is distributed in its entirety and that the copyright and this notice are included.

