Encdec

1. Encdec

These functions may be used to encode and decode C objects such as integers, floats, doubles, times, and internationalized strings to and from a wide variety of binary formats as they might appear in portable file formats or network messages. The encodings include 16, 32, and 64 bit big and little endian integers, big and little endian IEEE754 float and double values, 8 time encodings, and the wide range of string encodings supported by libiconv. The functions are all designed for in-situ decoding and encoding of complex formats.

The Encdec Java Class

See the src/Encdec.java file for equivalent methods in Java. Formats generated by these two implementations are compatible with two exceptions: 64 bit times encoded using Java will be truncated to the 1 second resolution of the C time_t type, and the Java methods do not provide string encoding/decoding methods because Java supports a wide variety of encodings natively. The "UTF-8" encoding is good for transferring strings between Java and C. Note that the encoding identifier used with the String constructor and String.getBytes() method may need to be specified as "UTF8" without the hyphen. In fact, many of the identifiers are different, so it will be necessary to look up the correct identifier in the Java i18n documentation.

1.1. The FLD macro

The FLD macro
Synopsis


  #include <encdec.h>
	
  unsigned int FLD(i, m);

Description
The FLD macro is used to decode bit-fields. It returns an integer representing the value occupying the bits selected by mask m. For example, if the input is 0xCBA98765 and the mask is 0x00FFFF00, a value of 0xA987 is returned. With basic register optimizations this is equivalent to the expression (0xCBA98765 >> 8) & 0xFFFF. Masks need not be contiguous: FLD(i, 0x7F080) is equivalent to (i >> 7) & 0xFE1.
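The arithmetic in this description can be checked with a small stand-alone sketch. FLD_SKETCH and fld_sketch are hypothetical names, and the definition below is an assumption made for illustration; it is not necessarily how encdec.h defines FLD. It masks the input and divides by the lowest set bit of the mask, which has the same effect as shifting right by the number of trailing zero bits in the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FLD; not taken from encdec.h.
 * (m & (~m + 1)) isolates the lowest set bit of the mask, so the
 * division shifts the masked value right by the mask's trailing zeros. */
#define FLD_SKETCH(i, m) \
    ((((uint32_t)(i)) & ((uint32_t)(m))) / \
     (((uint32_t)(m)) & (~((uint32_t)(m)) + 1u)))

/* Function form of the same idea with a guard against an empty mask,
 * which would otherwise divide by zero. */
uint32_t fld_sketch(uint32_t i, uint32_t m)
{
    assert(m != 0);
    return FLD_SKETCH(i, m);
}
```

With this sketch, fld_sketch(0xCBA98765, 0x00FFFF00) yields 0xA987, matching the example in the description.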

1.2. Integer functions

These functions should be used to encode and decode 16, 32, and 64 bit integers.

The enc_uint16be function
Synopsis


  #include <encdec.h>
  size_t enc_uint16be(uint16_t s, unsigned char *dst);

Description
Encode a 16 bit integer in big endian order into the memory at dst and return the number of bytes written which is always 2.

The enc_uint32be function
Synopsis


  #include <encdec.h>
  size_t enc_uint32be(uint32_t i, unsigned char *dst);

Description
Encode a 32 bit integer in big endian order into the memory at dst and return the number of bytes written which is always 4.

The enc_uint64be function
Synopsis


  #include <encdec.h>
  size_t enc_uint64be(uint64_t l, unsigned char *dst);

Description
Encode a 64 bit integer in big endian order into the memory at dst and return the number of bytes written which is always 8.
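To make the byte layout concrete, the following sketch shows one way big endian encoding can be written by hand. The sketch_ functions are hypothetical re-implementations written for this illustration; they are not the library source and only mirror the documented signatures and return values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: write the most significant byte first and
 * return the number of bytes written. */
size_t sketch_enc_uint32be(uint32_t i, unsigned char *dst)
{
    assert(dst != NULL);
    dst[0] = (unsigned char)((i >> 24) & 0xFF);
    dst[1] = (unsigned char)((i >> 16) & 0xFF);
    dst[2] = (unsigned char)((i >> 8) & 0xFF);
    dst[3] = (unsigned char)(i & 0xFF);
    return 4;
}

/* Same idea for 64 bits, using a loop over the 8 output bytes. */
size_t sketch_enc_uint64be(uint64_t l, unsigned char *dst)
{
    size_t n;

    assert(dst != NULL);
    for (n = 0; n < 8; n++) {
        dst[n] = (unsigned char)((l >> (56 - 8 * n)) & 0xFF);
    }
    return 8;
}
```

Because the bytes are produced with shifts rather than by copying memory, the output is the same regardless of the byte order of the host.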

The enc_uint16le function
Synopsis


  #include <encdec.h>
  size_t enc_uint16le(uint16_t s, unsigned char *dst);

Description
Encode a 16 bit integer in little endian order into the memory at dst and return the number of bytes written which is always 2.

The enc_uint32le function
Synopsis


  #include <encdec.h>
  size_t enc_uint32le(uint32_t i, unsigned char *dst);

Description
Encode a 32 bit integer in little endian order into the memory at dst and return the number of bytes written which is always 4.

The enc_uint64le function
Synopsis


  #include <encdec.h>
  size_t enc_uint64le(uint64_t l, unsigned char *dst);

Description
Encode a 64 bit integer in little endian order into the memory at dst and return the number of bytes written which is always 8.
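Since every encoder returns the number of bytes written, calls can be chained to build a message in place. The sketch below illustrates that pattern with hypothetical stand-ins (the sketch_ functions) and an invented 6 byte header layout; neither is part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins with the same shape as the documented functions. */
size_t sketch_enc_uint16le(uint16_t s, unsigned char *dst)
{
    assert(dst != NULL);
    dst[0] = (unsigned char)(s & 0xFF);         /* least significant byte first */
    dst[1] = (unsigned char)((s >> 8) & 0xFF);
    return 2;
}

size_t sketch_enc_uint32le(uint32_t i, unsigned char *dst)
{
    size_t n;

    assert(dst != NULL);
    for (n = 0; n < 4; n++) {
        dst[n] = (unsigned char)((i >> (8 * n)) & 0xFF);
    }
    return 4;
}

/* Write a made-up header (16 bit type followed by 32 bit length) into buf,
 * advancing a cursor by each return value. Returns the total size. */
size_t sketch_write_header(uint16_t type, uint32_t length, unsigned char *buf)
{
    unsigned char *p = buf;

    assert(buf != NULL);
    p += sketch_enc_uint16le(type, p);
    p += sketch_enc_uint32le(length, p);
    return (size_t)(p - buf);
}
```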

The dec_uint16be function
Synopsis


  #include <encdec.h>
  uint16_t dec_uint16be(const unsigned char *src);

Description
Return a 16 bit integer decoded in big endian byte order from 2 bytes of src memory.

The dec_uint32be function
Synopsis


  #include <encdec.h>
  uint32_t dec_uint32be(const unsigned char *src);

Description
Return a 32 bit integer decoded in big endian byte order from 4 bytes of src memory.

The dec_uint64be function
Synopsis


  #include <encdec.h>
  uint64_t dec_uint64be(const unsigned char *src);

Description
Return a 64 bit integer decoded in big endian byte order from 8 bytes of src memory.
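Decoding reverses the shifts used for encoding. The following sketch uses hypothetical sketch_ functions that mirror the documented signatures; it is an illustration under that assumption rather than the library's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch: the first byte in memory is the most significant. */
uint16_t sketch_dec_uint16be(const unsigned char *src)
{
    assert(src != NULL);
    return (uint16_t)(((unsigned)src[0] << 8) | (unsigned)src[1]);
}

uint32_t sketch_dec_uint32be(const unsigned char *src)
{
    assert(src != NULL);
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8)  |  (uint32_t)src[3];
}
```

Each byte is widened before shifting so that no bits are lost to the promotion rules for unsigned char.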

The dec_uint16le function
Synopsis


  #include <encdec.h>
  uint16_t dec_uint16le(const unsigned char *src);

Description
Return a 16 bit integer decoded in little endian byte order from 2 bytes of src memory.

The dec_uint32le function
Synopsis


  #include <encdec.h>
  uint32_t dec_uint32le(const unsigned char *src);

Description
Return a 32 bit integer decoded in little endian byte order from 4 bytes of src memory.

The dec_uint64le function
Synopsis


  #include <encdec.h>
  uint64_t dec_uint64le(const unsigned char *src);

Description
Return a 64 bit integer decoded in little endian byte order from 8 bytes of src memory.
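Because the decoders read directly from a byte pointer, a record can be parsed in situ by decoding at fixed offsets. The sketch below assumes an invented 10 byte record (16 bit id followed by a 64 bit offset, both little endian) and uses hypothetical sketch_ functions in place of the library calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the documented signatures. */
uint16_t sketch_dec_uint16le(const unsigned char *src)
{
    assert(src != NULL);
    return (uint16_t)((unsigned)src[0] | ((unsigned)src[1] << 8));
}

uint64_t sketch_dec_uint64le(const unsigned char *src)
{
    uint64_t l = 0;
    int n;

    assert(src != NULL);
    for (n = 7; n >= 0; n--) {
        l = (l << 8) | src[n];   /* the last byte is the most significant */
    }
    return l;
}

/* Invented record layout used only for this example. */
struct sketch_record {
    uint16_t id;
    uint64_t offset;
};

void sketch_parse_record(const unsigned char *src, struct sketch_record *out)
{
    assert(src != NULL && out != NULL);
    out->id = sketch_dec_uint16le(src);          /* bytes 0..1 */
    out->offset = sketch_dec_uint64le(src + 2);  /* bytes 2..9 */
}
```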

1.3. Time functions

These functions may be used to encode and decode times in a wide variety of low-resolution time encodings.

The enc_time function
Synopsis


  #include <encdec.h>
  size_t enc_time(const time_t *timep, unsigned char *dst, int enc);

Description
Encode the time_t object pointed to by timep into the memory at dst in the format specified by enc. The following constants are valid enc parameters.


  Identifier            Units        Epoch Bits Endianness Use case
  -----------------------------------------------------------------
  TIME_1970_SEC_32BE    Seconds      1970  32   big        time_t
  TIME_1970_SEC_32LE    Seconds      1970  32   little     time_t
  TIME_1904_SEC_32BE    Seconds      1904  32   big        MS
  TIME_1904_SEC_32LE    Seconds      1904  32   little     MS
  TIME_1601_NANOS_64BE  Nanoseconds  1601  64   big        MS
  TIME_1601_NANOS_64LE  Nanoseconds  1601  64   little     MS
  TIME_1970_MILLIS_64BE Milliseconds 1970  64   big        Java
  TIME_1970_MILLIS_64LE Milliseconds 1970  64   little     Java
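As an illustration of what two of these formats involve, the sketch below converts a time_t into 1904-based seconds and 1970-based milliseconds. The sketch_ functions are hypothetical and are not the library's enc_time; the sketch assumes that time_t counts non-negative seconds since 1970, and returning the byte count is an assumption borrowed from the integer functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Seconds from 1904-01-01 to 1970-01-01: 66 years including 17 leap days. */
#define SKETCH_SECS_1904_TO_1970 UINT64_C(2082844800)

/* Hypothetical sketch of a 1904 epoch, 32 bit, big endian encoding. */
size_t sketch_enc_time_1904_sec_32be(const time_t *timep, unsigned char *dst)
{
    uint32_t secs;

    assert(timep != NULL && dst != NULL);
    /* Shift the epoch, then keep the low 32 bits. */
    secs = (uint32_t)((uint64_t)*timep + SKETCH_SECS_1904_TO_1970);
    dst[0] = (unsigned char)((secs >> 24) & 0xFF);
    dst[1] = (unsigned char)((secs >> 16) & 0xFF);
    dst[2] = (unsigned char)((secs >> 8) & 0xFF);
    dst[3] = (unsigned char)(secs & 0xFF);
    return 4;
}

/* Hypothetical sketch of a 1970 epoch, 64 bit millisecond, big endian encoding. */
size_t sketch_enc_time_1970_millis_64be(const time_t *timep, unsigned char *dst)
{
    uint64_t ms;
    size_t n;

    assert(timep != NULL && dst != NULL);
    ms = (uint64_t)*timep * 1000u;   /* time_t carries whole seconds only */
    for (n = 0; n < 8; n++) {
        dst[n] = (unsigned char)((ms >> (56 - 8 * n)) & 0xFF);
    }
    return 8;
}
```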

The dec_time function
Synopsis


  #include <encdec.h>
  time_t dec_time(const unsigned char *src, int enc);

Description
Decode and return a time_t object encoded as enc in src. The constants listed in the enc_time description are valid enc parameters for this function as well.
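The reverse direction can be sketched in the same way. The hypothetical sketch_ functions below are not the library's dec_time; they assume that time_t counts seconds since 1970 and show how a millisecond value loses its sub-second part when stored in a time_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical sketch: 1970 epoch, 32 bit seconds, little endian. */
time_t sketch_dec_time_1970_sec_32le(const unsigned char *src)
{
    uint32_t secs;

    assert(src != NULL);
    secs = (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
    return (time_t)secs;
}

/* Hypothetical sketch: 1970 epoch, 64 bit milliseconds, little endian.
 * The division discards the milliseconds within the second. */
time_t sketch_dec_time_1970_millis_64le(const unsigned char *src)
{
    uint64_t ms = 0;
    int n;

    assert(src != NULL);
    for (n = 7; n >= 0; n--) {
        ms = (ms << 8) | src[n];
    }
    return (time_t)(ms / 1000u);
}
```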

1.4. Floating point numbers

The enc_floatle function
Synopsis


  #include <encdec.h>
  size_t enc_floatle(const float f, unsigned char *dst);

Description
Encode a 32 bit real number f into dst in little endian IEEE754 format and return the number of bytes encoded which is always 4.

The enc_doublele function
Synopsis


  #include <encdec.h>
  size_t enc_doublele(const double d, unsigned char *dst);

Description
Encode a 64 bit real number d into dst in little endian IEEE754 format and return the number of bytes encoded which is always 8.

The enc_floatbe function
Synopsis


  #include <encdec.h>
  size_t enc_floatbe(const float f, unsigned char *dst);

Description
Encode a 32 bit real number f into dst in big endian IEEE754 format and return the number of bytes encoded which is always 4.

The enc_doublebe function
Synopsis


  #include <encdec.h>
  size_t enc_doublebe(const double d, unsigned char *dst);

Description
Encode a 64 bit real number d into dst in big endian IEEE754 format and return the number of bytes encoded which is always 8.
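A common way to produce these formats is to copy the bits of the value into an integer and then emit the integer in the desired byte order. The sketch below takes that approach with hypothetical sketch_ functions; it assumes the host float is a 32 bit IEEE754 value and is not taken from the library source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of f into a 32 bit integer.
 * Assumes the host float is IEEE754 binary32. */
static uint32_t sketch_float_bits(float f)
{
    uint32_t bits;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

size_t sketch_enc_floatle(float f, unsigned char *dst)
{
    uint32_t bits = sketch_float_bits(f);
    size_t n;

    assert(dst != NULL);
    for (n = 0; n < 4; n++) {
        dst[n] = (unsigned char)((bits >> (8 * n)) & 0xFF);
    }
    return 4;
}

size_t sketch_enc_floatbe(float f, unsigned char *dst)
{
    uint32_t bits = sketch_float_bits(f);
    size_t n;

    assert(dst != NULL);
    for (n = 0; n < 4; n++) {
        dst[n] = (unsigned char)((bits >> (24 - 8 * n)) & 0xFF);
    }
    return 4;
}
```

The value 1.0f has the bit pattern 0x3F800000, so the little endian form is 00 00 80 3F and the big endian form is 3F 80 00 00.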

The dec_floatle function
Synopsis


  #include <encdec.h>
  float dec_floatle(const unsigned char *src);

Description
Return a 32 bit real number decoded in little endian IEEE754 format from 4 bytes of src memory.

The dec_doublele function
Synopsis


  #include <encdec.h>
  double dec_doublele(const unsigned char *src);

Description
Return a 64 bit real number decoded in little endian IEEE754 format from 8 bytes of src memory.

The dec_floatbe function
Synopsis


  #include <encdec.h>
  float dec_floatbe(const unsigned char *src);

Description
Return a 32 bit real number decoded in big endian IEEE754 format from 4 bytes of src memory.

The dec_doublebe function
Synopsis


  #include <encdec.h>
  double dec_doublebe(const unsigned char *src);

Description
Return a 64 bit real number decoded in big endian IEEE754 format from 8 bytes of src memory.
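Decoding a double can be sketched as assembling the 8 bytes into a 64 bit integer and copying the bits into a double. sketch_dec_doublebe is a hypothetical name, and the sketch assumes the host double is a 64 bit IEEE754 value; it is not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: the first byte holds the sign and high exponent bits. */
double sketch_dec_doublebe(const unsigned char *src)
{
    uint64_t bits = 0;
    double d;
    size_t n;

    assert(src != NULL);
    assert(sizeof(double) == sizeof(uint64_t));
    for (n = 0; n < 8; n++) {
        bits = (bits << 8) | src[n];
    }
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bits without aliasing issues */
    return d;
}
```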

1.5. String functions

The enc_mbsncpy function
Synopsis


  #include <encdec.h>
  int enc_mbsncpy(const char *src,
           size_t sn,
           char **dst,
           size_t dn,
           int cn,
           const char *tocode);

Description
The enc_mbsncpy function encodes the multi-byte string at src into dst using the tocode encoding identifier. The tocode parameter can be one of the standard encoding identifiers such as "UTF-8", "KOI8-R", "ISO-8859-2", etc. See the libiconv documentation for a complete list:

http://www.gnu.org/software/libiconv/

Specifically, the enc_mbsncpy function:

does not read more than sn bytes of src,
does not write to more than dn bytes of dst,
does not convert more than cn characters,
does not convert characters after a '\0' encountered in src,
advances dst by the number of bytes encoded into dst,
and returns the number of characters converted.
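The limits and the advancing dst pointer listed above can be illustrated without any character set conversion. The hypothetical sketch_enc_copy below copies single-byte characters unchanged instead of converting them with libiconv, and it does not write a terminator; both are simplifications assumed for this sketch, so it shows the calling convention only and not the behavior of enc_mbsncpy itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: an identity "conversion" of single-byte characters
 * that honors the sn, dn and cn limits and advances *dst. */
int sketch_enc_copy(const char *src, size_t sn, char **dst, size_t dn, int cn)
{
    size_t used = 0;   /* bytes read from src and written to *dst */
    int count = 0;     /* characters converted */

    assert(src != NULL && dst != NULL && *dst != NULL);
    while (used < sn && used < dn && count < cn && src[used] != '\0') {
        (*dst)[used] = src[used];
        used++;
        count++;
    }
    *dst += used;      /* the caller's cursor now points past the output */
    return count;
}
```

Because the cursor is advanced on each call, consecutive calls append to the same buffer without the caller tracking an offset.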

The enc_mbscpy function
Synopsis


  #include <encdec.h>
  int enc_mbscpy(const char *src, char **dst, const char *tocode);

Description
The enc_mbscpy function encodes the multi-byte string at src into dst using the tocode encoding identifier. The conversion stops when a '\0' character is encountered in src. This function is equivalent to enc_mbsncpy(src, INT_MAX, dst, INT_MAX, INT_MAX, tocode). See enc_mbsncpy for details.

The dec_mbsncpy function
Synopsis


  #include <encdec.h>
  size_t dec_mbsncpy(char **src,
           size_t sn,
           char *dst,
           size_t dn,
           int cn,
           const char *fromcode);

Description
The dec_mbsncpy function decodes the string at src encoded as fromcode to the memory at dst as a locale dependent string (possibly UTF-8). The fromcode parameter can be one of the standard encoding identifiers such as "UTF-8", "KOI8-R", "ISO-8859-2", etc. See the libiconv documentation for a complete list:

http://www.gnu.org/software/libiconv/

More specifically, the dec_mbsncpy function:

does not read more than sn bytes of src,
does not write to more than dn bytes of dst,
does not convert more than cn characters,
does not convert characters after a '\0' encountered in src,
advances src by the number of bytes decoded,
and returns the number of bytes written to dst unless a '\0' terminator is not encountered in src, in which case one is artificially written to dst but not counted in the return value.

Additionally, if dst is NULL this function:

does not write to dst,
does not advance the src pointer,
and returns the exact number of bytes that would have been required to encode the multi-byte string had dst not been NULL (i.e. for malloc). This includes the '\0' terminator regardless of whether one was encountered in src.

The dec_mbscpy function
Synopsis


  #include <encdec.h>
  size_t dec_mbscpy(char **src, char *dst, const char *fromcode);

Description
The dec_mbscpy function decodes the string at src encoded as fromcode to the memory at dst as a locale dependent string (possibly UTF-8). The conversion stops when the character '\0' is encountered in src. This function is equivalent to dec_mbsncpy(src, INT_MAX, dst, INT_MAX, INT_MAX, fromcode). See dec_mbsncpy for details.

The dec_mbsndup function
Synopsis


  #include <encdec.h>
  char *dec_mbsndup(char **src,
           size_t sn,
           size_t dn,
           int wn,
           const char *fromcode);

Description
The dec_mbsndup function decodes the string at src encoded as fromcode and returns a locale dependent string (possibly UTF-8) stored in memory allocated with malloc(3). This memory should be freed with free(3) when it will no longer be referenced. This function calls dec_mbsncpy(src, sn, NULL, dn, wn, fromcode) to determine the required size, allocates precisely that amount of memory, decodes the string into it with dec_mbsncpy(src, sn, dst, dn, wn, fromcode), and returns the new string.
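The measure-then-allocate sequence described here can be sketched as follows. Both functions are hypothetical: sketch_dec_copy stands in for dec_mbsncpy using an identity copy of single-byte characters rather than libiconv, and its return convention follows one reading of the dec_mbsncpy description. Negative limits are not handled, so the -1 arguments used by dec_mbsdup are outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical decoder. With dst == NULL it only measures and returns the
 * bytes needed including the terminator. Otherwise it writes a terminated
 * string, advances *src, and counts the terminator only if one was read. */
size_t sketch_dec_copy(char **src, size_t sn, char *dst, size_t dn, int cn)
{
    const char *s;
    size_t n = 0;
    int saw_nul = 0;

    assert(src != NULL && *src != NULL);
    s = *src;
    while (n < sn && n + 1 < dn && (int)n < cn) {   /* keep room for '\0' */
        if (s[n] == '\0') {
            saw_nul = 1;
            break;
        }
        if (dst != NULL) {
            dst[n] = s[n];
        }
        n++;
    }
    if (dst == NULL) {
        return n + 1;
    }
    dst[n] = '\0';
    *src += n + (size_t)saw_nul;
    return n + (size_t)saw_nul;
}

/* Two passes: measure with a NULL dst, allocate, then decode for real. */
char *sketch_dec_dup(char **src, size_t sn, size_t dn, int cn)
{
    size_t need = sketch_dec_copy(src, sn, NULL, dn, cn);
    char *dst = malloc(need);

    if (dst == NULL) {
        return NULL;
    }
    sketch_dec_copy(src, sn, dst, dn, cn);
    return dst;
}
```

The first pass leaves src untouched, so the second pass starts from the same position and the caller sees src advanced exactly once.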

The dec_mbsdup function
Synopsis


  #include <encdec.h>
  char *dec_mbsdup(char **src, const char *fromcode);

Description
The dec_mbsdup function decodes the string at src encoded as fromcode and returns a locale dependent string (possibly UTF-8) stored in memory allocated with malloc(3). This memory should be freed with free(3) when it will no longer be referenced. This function is equivalent to dec_mbsndup(src, -1, -1, -1, fromcode). See dec_mbsndup for details.