BER Decode Performance Enhancement Techniques
Several techniques can be applied in application code to improve BER decode performance: adjusting memory allocation parameters, using compact code generation, and using decode fast copy.
By far, the biggest performance bottleneck when decoding ASN.1 messages is the allocation of memory from the heap. Each call to new or malloc is very expensive.
The decoding functions must allocate memory because the sizes of many of the variables that make up a message are not known at compile time. For example, an OCTET STRING that does not contain a size constraint can be an indeterminate number of bytes in length.
ASN1C does two things by default to reduce dynamic memory allocations and improve decoding performance:
- Uses static variables wherever it can. Any BIT STRING, OCTET STRING, character string, or SEQUENCE OF or SET OF construct that contains a size constraint will result in the generation of a static array of elements sized to the max constraint bound.
- Uses a special nibble-allocation algorithm for allocating dynamic memory. This algorithm allocates memory in large blocks and then splits up these blocks on subsequent memory allocation requests. This results in fewer calls to the kernel to get memory. The downside is that one request for a few bytes of memory can result in a large block being allocated.
Common run-time functions are available for controlling the memory allocation process. First, the default size of a memory block as allocated by the nibble-allocation algorithm can be changed. By default, this value is set to 4K bytes. The run-time function rtMemSetDefBlkSize can be called to change this size. This takes a single argument - the value to which the size should be changed.
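The effect of the block size can be illustrated with a minimal sketch of a block allocator. This is not the ASN1C run-time implementation: the Pool type and the pool_* and heap_calls_for names are hypothetical, and the sketch only shows how carving small requests out of large blocks reduces the number of heap calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block pool, for illustration only (not ASN1C code). */
typedef struct Block {
    struct Block *next;
    size_t used;          /* bytes already handed out from this block */
    size_t cap;           /* usable bytes in this block */
} Block;

typedef struct {
    Block *head;          /* most recently allocated block */
    size_t blk_size;      /* default block size, analogous to the value
                             that rtMemSetDefBlkSize changes */
    unsigned heap_calls;  /* number of malloc calls made so far */
} Pool;

static void pool_init(Pool *p, size_t blk_size) {
    p->head = NULL;
    p->blk_size = blk_size;
    p->heap_calls = 0;
}

/* Carve n bytes out of the current block; go to the heap only when
   the current block cannot satisfy the request. */
static void *pool_alloc(Pool *p, size_t n) {
    n = (n + 7u) & ~(size_t)7u;   /* round the request up to 8 bytes */
    if (p->head == NULL || p->head->cap - p->head->used < n) {
        size_t cap = n > p->blk_size ? n : p->blk_size;
        Block *b = malloc(sizeof(Block) + cap);
        if (b == NULL) return NULL;
        b->next = p->head;
        b->used = 0;
        b->cap = cap;
        p->head = b;
        p->heap_calls++;
    }
    void *mem = (unsigned char *)(p->head + 1) + p->head->used;
    p->head->used += n;
    return mem;
}

static void pool_free_all(Pool *p) {
    while (p->head != NULL) {
        Block *next = p->head->next;
        free(p->head);
        p->head = next;
    }
}

/* Returns how many heap calls `count` requests of `size` bytes cost
   for a given default block size. */
static unsigned heap_calls_for(size_t blk_size, unsigned count, size_t size) {
    Pool p;
    pool_init(&p, blk_size);
    for (unsigned i = 0; i < count; i++) {
        void *m = pool_alloc(&p, size);
        assert(m != NULL);
        (void)m;
    }
    unsigned calls = p.heap_calls;
    pool_free_all(&p);
    return calls;
}
```

In this sketch, 100 requests of 16 bytes cost a single heap call with a 4096-byte block but 25 heap calls with a 64-byte block. A single 3-byte request still reserves a whole block, which is the trade-off described above.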
It is also possible to change the underlying functions called from within the memory management abstraction layer to obtain or free heap memory. By default, the standard C malloc, realloc, and free functions are used. These can be changed by calling the rtMemSetAllocFuncs function. This function takes as arguments function pointers to the allocate, reallocate, and free functions to be used in place of the standard C functions.
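The idea of replaceable allocation functions can be sketched as a small abstraction layer built on function pointers. All names below (mem_set_alloc_funcs, mem_alloc, the counting_* wrappers) are hypothetical and are not the ASN1C API. The exact signature of rtMemSetAllocFuncs should be taken from the run-time reference manual.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical function pointer types for the three heap operations. */
typedef void *(*AllocFn)(size_t);
typedef void *(*ReallocFn)(void *, size_t);
typedef void (*FreeFn)(void *);

/* The layer defaults to the standard C functions. */
static AllocFn   g_alloc   = malloc;
static ReallocFn g_realloc = realloc;
static FreeFn    g_free    = free;

/* Replace all three functions at once (illustrative analogue of the
   operation that rtMemSetAllocFuncs performs). */
static void mem_set_alloc_funcs(AllocFn a, ReallocFn r, FreeFn f) {
    g_alloc = a;
    g_realloc = r;
    g_free = f;
}

/* All other code calls these wrappers, never malloc/free directly. */
static void *mem_alloc(size_t n)            { return g_alloc(n); }
static void *mem_realloc(void *p, size_t n) { return g_realloc(p, n); }
static void  mem_free(void *p)              { g_free(p); }

/* Example replacement set: forwards to the C library but counts calls. */
static unsigned g_allocs, g_frees;
static void *counting_malloc(size_t n)           { g_allocs++; return malloc(n); }
static void *counting_realloc(void *p, size_t n) { return realloc(p, n); }
static void  counting_free(void *p)              { g_frees++; free(p); }

/* Returns 1 if the replacement functions were the ones actually called. */
static int demo_counting(void) {
    g_allocs = 0;
    g_frees = 0;
    mem_set_alloc_funcs(counting_malloc, counting_realloc, counting_free);

    void *p = mem_alloc(32);
    if (p == NULL) return 0;
    void *q = mem_realloc(p, 64);
    if (q == NULL) { mem_free(p); return 0; }
    mem_free(q);

    /* Restore the defaults so later code is unaffected. */
    mem_set_alloc_funcs(malloc, realloc, free);
    return g_allocs == 1 && g_frees == 1;
}
```

A typical reason to swap the functions is to route allocations to a fixed memory pool or an instrumented allocator. The counting wrappers here stand in for either.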
Another run-time memory management function that can improve performance is rtMemReset. This function is useful when decoding messages in a loop. It is used instead of rtMemFree at the bottom of the loop to make dynamic memory available for decoding the next message. The difference is that rtMemReset does not actually free the dynamic memory. Instead, it resets the internal memory management parameters so that memory already allocated can be reused. As a result, all the memory required to handle message decoding is normally allocated within the first few passes of the loop. From that point on, that memory is reused, making dynamic memory allocation a negligible factor in the overall performance of the decoder.
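The difference between resetting and freeing can be shown with a minimal sketch. The Arena type and the arena_* and decode_loop names are hypothetical, and the "decode" step is simulated by ten 100-byte allocations per message. The sketch only illustrates why a reset lets later passes run without touching the heap.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena, for illustration only (not ASN1C code). */
typedef struct Blk { struct Blk *next; size_t used, cap; } Blk;

typedef struct {
    Blk *first, *cur;     /* block list and current search position */
    size_t blk_size;
    unsigned heap_calls;  /* number of malloc calls made so far */
} Arena;

static void *arena_alloc(Arena *a, size_t n) {
    n = (n + 7u) & ~(size_t)7u;
    Blk *b = a->cur;
    /* Prefer blocks retained from earlier passes. */
    while (b != NULL && b->cap - b->used < n) b = b->next;
    if (b == NULL) {
        size_t cap = n > a->blk_size ? n : a->blk_size;
        b = malloc(sizeof(Blk) + cap);
        if (b == NULL) return NULL;
        b->next = NULL;
        b->used = 0;
        b->cap = cap;
        if (a->first == NULL) {
            a->first = b;
        } else {
            Blk *t = a->first;
            while (t->next != NULL) t = t->next;
            t->next = b;          /* append at the tail */
        }
        a->heap_calls++;
    }
    a->cur = b;
    void *mem = (unsigned char *)(b + 1) + b->used;
    b->used += n;
    return mem;
}

/* Reset: keep every block, mark all of them empty (cf. rtMemReset). */
static void arena_reset(Arena *a) {
    for (Blk *b = a->first; b != NULL; b = b->next) b->used = 0;
    a->cur = a->first;
}

/* Free: return every block to the heap (cf. rtMemFree). */
static void arena_free(Arena *a) {
    while (a->first != NULL) {
        Blk *next = a->first->next;
        free(a->first);
        a->first = next;
    }
    a->cur = NULL;
}

/* Simulates decoding `passes` messages; returns total heap calls. */
static unsigned decode_loop(unsigned passes, int use_reset) {
    Arena a = { NULL, NULL, 512, 0 };
    for (unsigned i = 0; i < passes; i++) {
        for (int k = 0; k < 10; k++) {
            if (arena_alloc(&a, 100) == NULL) break;
        }
        /* ... the decoded message would be processed here ... */
        if (use_reset) arena_reset(&a); else arena_free(&a);
    }
    arena_free(&a);
    return a.heap_calls;
}
```

With these numbers, five passes cost 3 heap calls when the arena is reset at the bottom of the loop and 15 when it is freed, because every pass after the first reuses the three blocks acquired in the first pass.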
A more detailed explanation of these functions and other memory management functions can be found in the C/C++ Common Run-Time Library Reference Manual.
Using the compact code generation option (-compact) and lax validation option (-lax) can also improve decoding performance.
The -compact option causes code to be generated that contains no diagnostic or error trace messages. In addition, some status checks and other non-critical code are removed providing a slightly less robust but faster code base.
Performance intensive applications should also be sure to link with the compact version of the base run-time libraries. These libraries can be found in the lib_opt (for optimized) subdirectory. These run-time libraries also have all diagnostics and error trace messages removed as well as some non-critical status checks.
"Fast Copy" is a special run-time flag that can be set for the decoder that can substantially reduce the number of copy operations that need to be done to decode a message. The copy operations are reduced by taking advantage of the fact that the data contents of some ASN.1 types already exist in decoded form in the message buffer. Therefore, there is no need to allocate memory for the data and then copy the data from the buffer into the allocated memory structure.
As an example of what fast copy does, consider a simple ASN.1 SEQUENCE consisting of two elements: a, an INTEGER, and b, an OCTET STRING:
Simple ::= SEQUENCE { a INTEGER, b OCTET STRING }
Assume an encoded value of this type in which a = 123 (hex 7B) and b contains the hex octets 0x01 0x02 0x03. The generated variable for the OCTET STRING will contain a data pointer. So rather than allocate memory for this string and copy the data to it, fast copy will simply store a pointer directly to the data in the buffer.
The pointer stored in the data structure points directly at data in the message buffer. No memory allocation or copy is done.
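The following hand-written sketch decodes the Simple value above without copying the OCTET STRING contents. It is not ASN1C-generated code: the OctStr and Simple structures and the read_tl and decode_simple functions are hypothetical, and only primitive encodings with short-form definite lengths are handled.

```c
#include <stddef.h>

/* Hypothetical decoded types; ASN1C-generated types will differ. */
typedef struct {
    size_t numocts;
    const unsigned char *data;   /* points INTO the message buffer */
} OctStr;

typedef struct {
    long a;
    OctStr b;
} Simple;

/* BER encoding of Simple { a 123, b '010203'H }:
   30 08            SEQUENCE, length 8
      02 01 7B      INTEGER, length 1, value 123
      04 03 01 02 03  OCTET STRING, length 3 */
static const unsigned char simple_msg[] = {
    0x30, 0x08, 0x02, 0x01, 0x7B, 0x04, 0x03, 0x01, 0x02, 0x03
};

/* Reads an expected tag and a short-form length. Returns 0 on success. */
static int read_tl(const unsigned char *buf, size_t len, size_t *pos,
                   unsigned char tag, size_t *out_len) {
    if (*pos + 2 > len || buf[*pos] != tag || buf[*pos + 1] > 0x7F)
        return -1;
    *out_len = buf[*pos + 1];
    *pos += 2;
    if (*out_len > len - *pos) return -1;   /* contents must fit */
    return 0;
}

/* Decodes a Simple value. The OCTET STRING is not copied: out->b.data
   is set to the address of the contents inside buf. Returns 0 on success. */
static int decode_simple(const unsigned char *buf, size_t len, Simple *out) {
    size_t pos = 0, n;

    if (read_tl(buf, len, &pos, 0x30, &n) != 0) return -1;

    if (read_tl(buf, len, &pos, 0x02, &n) != 0) return -1;
    if (n == 0 || n > sizeof(long)) return -1;
    /* Sign-extend from the first contents octet, then shift in the rest. */
    unsigned long u = (buf[pos] & 0x80) ? ~0UL : 0UL;
    for (size_t i = 0; i < n; i++) u = (u << 8) | buf[pos + i];
    out->a = (long)u;
    pos += n;

    if (read_tl(buf, len, &pos, 0x04, &n) != 0) return -1;
    out->b.numocts = n;
    out->b.data = buf + pos;   /* no allocation, no memcpy */
    return 0;
}
```

After decode_simple(simple_msg, sizeof simple_msg, &s) succeeds, s.a is 123, s.b.numocts is 3, and s.b.data equals simple_msg + 7, the address of the first contents octet in the buffer.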
The user must keep in mind that if this technique is used, the message buffer containing the encoded message must remain available and unmodified for as long as the type variable containing the decoded data is in use. This will not work in a producer-consumer threading model where one thread is decoding messages and the next thread is processing the contents. The producer thread will overwrite the buffer contents, and with them the data referenced by the decoded message type variable that the consumer is still processing.
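The hazard can be reproduced in a few lines. The sketch below uses a hypothetical OctView structure and a hypothetical oct_detach helper. Copying the data before handing it to another thread is one possible workaround, assumed here for illustration and not described in the source. It gives up the fast copy saving for the values that are copied.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view of an OCTET STRING whose data lives elsewhere. */
typedef struct {
    size_t numocts;
    const unsigned char *data;
} OctView;

/* Replaces the borrowed pointer with a private heap copy. The caller
   must later free the copy. Returns 0 on success, -1 on failure. */
static int oct_detach(OctView *v) {
    unsigned char *copy = malloc(v->numocts ? v->numocts : 1);
    if (copy == NULL) return -1;
    memcpy(copy, v->data, v->numocts);
    v->data = copy;
    return 0;
}

/* Returns 1 if the aliasing view saw the overwrite while the detached
   copy kept the original octets; returns -1 on allocation failure. */
static int demo_buffer_reuse(void) {
    unsigned char rx[3] = { 0x01, 0x02, 0x03 };  /* receive buffer */
    OctView alias = { sizeof rx, rx };           /* fast-copy style view */
    OctView owned = { sizeof rx, rx };

    if (oct_detach(&owned) != 0) return -1;

    /* The producer receives the next message into the same buffer. */
    memset(rx, 0xFF, sizeof rx);

    int alias_changed = (alias.data[0] == 0xFF);
    int owned_intact  = (owned.data[0] == 0x01 && owned.data[2] == 0x03);

    free((void *)owned.data);
    return alias_changed && owned_intact;
}
```

The view that still points into the receive buffer silently changes when the buffer is reused, while the detached copy does not.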
Objective Systems, Inc., 102 Pickering Way, Suite #506, Exton, Pennsylvania 19341. http://www.obj-sys.com Phone: (484) 875-9841 Toll-free: (877) 307-6855 (US only) Fax: (484) 875-9830 info@obj-sys.com