Consuming CDC Records

Use the OJAI changelog interfaces to consume changed data records propagated by the Change Data Capture (CDC) feature.

The general CDC flow consists of understanding architectural concepts, performing administrative tasks to set up and use CDC, performing CRUD operations on a database table, and developing applications that consume CDC changed data records. The following topics cover each step:

  • Learning about CDC
  • Administering Change Data Capture
  • Building a consumer app for CDC
  • Using dbshell to perform CRUD operations on HPE Ezmeral Data Fabric Database JSON tables
  • Developing client applications for HPE Ezmeral Data Fabric Database JSON tables
  • Using hbshell to perform CRUD operations on HPE Ezmeral Data Fabric Database binary tables
  • Developing client applications for HPE Ezmeral Data Fabric Database binary tables

Javadoc

See the following Java documentation for detailed information about CDC APIs.

Java OJAI CDC API

Deserializer for consuming CDC records

The deserializer converts stream messages into individual change data records. When your application creates a CDC consumer, you must also register the ChangeData deserializer by setting the value.deserializer configuration parameter to com.mapr.db.cdc.ChangeDataRecordDeserializer.
NOTE: When an application consumes from a CDC change topic, the record key retrieved from poll() is not deserialized, and it is not equal to the _id field of the document. To retrieve the exact _id of the document, call the ChangeDataRecord.getId() method.
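
As a minimal sketch of the configuration described above, the consumer properties might be assembled as follows. Only the value.deserializer class name comes from this page. The group ID, the choice of ByteArrayDeserializer for the key (picked here because the key is not deserialized), and the helper class name CdcConsumerConfig are illustrative assumptions.

```java
import java.util.Properties;

// Hypothetical helper (not part of the CDC API): builds consumer properties.
public class CdcConsumerConfig {

    public static Properties build(String groupId) {
        Properties props = new Properties();
        // Assumption: the consumer group name is supplied by the caller.
        props.setProperty("group.id", groupId);
        // Assumption: keep the record key as raw bytes, because it is not
        // deserialized and is not equal to the document _id.
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        // Required by this section: the ChangeData deserializer.
        props.setProperty("value.deserializer",
                "com.mapr.db.cdc.ChangeDataRecordDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("cdc-demo-group");
        System.out.println(props.getProperty("value.deserializer"));
    }
}
```

An application would pass these properties to its consumer, read each polled value as a ChangeDataRecord, and call getId() on that record rather than relying on the record key.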

Interfaces for working with CDC records

Use the following OJAI interfaces and enumerations to build consumers for CDC changed data.

ChangeNode
Contains the change to a single field in a document.
ChangeEvent
Identifies the change event associated with the current change node. The value of ChangeEvent can be one of the following:
  • NULL (no event)
  • NODE (a change with a real value)
  • START_MAP (a node representing the beginning of a map)
  • END_MAP (a node representing the end of a map)
  • START_ARRAY (a node representing the beginning of an array)
  • END_ARRAY (a node representing the end of an array)
ChangeOp
Identifies the type of operation performed on the current field. The value of ChangeOp can be one of the following:
  • NULL (no operation)
  • SET (replace the current field with the given value)
  • PUT (add an extra version of the value)
  • MERGE (combine the given value with the existing values in the table)
  • DELETE (delete all values older than or equal to the delete operation timestamp)
  • DELETE_EXACT (delete the version of the value with the given timestamp)
ChangeDataRecord
Contains all the changes made on a single document/row in the source table.
ChangeDataRecordType
Specifies the mode of change for the change data record. The following values are specified:
  • RECORD_INSERT
  • RECORD_UPDATE
  • RECORD_DELETE
ChangeDataReader
A parser that traverses the individual change tree nodes of a change data record. It provides cursor-like semantics: invoking the next method moves the cursor one tree node at a time. The APIs retrieve the properties of individual change nodes (for example, data type, field name, and field value).
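
To illustrate how the ChangeEvent values delimit nested structure during a cursor-style traversal, the following sketch walks a flat sequence of events and collects the dotted path of each leaf change. It does not use the real OJAI classes: the Event enum and Node class are local stand-ins that only mirror the event names listed above, the assumption that a START_MAP event carries the map's field name is made for illustration, and the sample data is invented. Array events are ignored to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class ChangeEventWalk {

    // Stand-in for the ChangeEvent values listed above (not the OJAI enum).
    public enum Event { NULL, NODE, START_MAP, END_MAP, START_ARRAY, END_ARRAY }

    // Stand-in for one change tree node: event, field name, and leaf value.
    public static final class Node {
        final Event event;
        final String field;
        final Object value;

        public Node(Event event, String field, Object value) {
            this.event = event;
            this.field = field;
            this.value = value;
        }
    }

    // Returns "path=value" for each NODE event, tracking map nesting on a stack.
    public static List<String> leafChanges(List<Node> nodes) {
        Deque<String> path = new ArrayDeque<>();
        List<String> out = new ArrayList<>();
        for (Node n : nodes) {
            switch (n.event) {
                case START_MAP:
                    path.addLast(n.field);   // descend into the map
                    break;
                case END_MAP:
                    path.removeLast();       // leave the map
                    break;
                case NODE:
                    path.addLast(n.field);
                    out.add(String.join(".", path) + "=" + n.value);
                    path.removeLast();
                    break;
                default:
                    break;                   // NULL and array events are skipped here
            }
        }
        return out;
    }

    // Invented sample: a top-level field plus one field inside a nested map.
    public static List<Node> sample() {
        return Arrays.asList(
                new Node(Event.NODE, "name", "Ada"),
                new Node(Event.START_MAP, "address", null),
                new Node(Event.NODE, "city", "London"),
                new Node(Event.END_MAP, null, null));
    }

    public static void main(String[] args) {
        for (String change : leafChanges(sample())) {
            System.out.println(change);
        }
    }
}
```

In a real consumer, the same stack-based bookkeeping would be driven by the events returned while advancing a ChangeDataReader instead of by a prebuilt list.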

Open Data Format

The CDC Open Format feature allows you to create applications in languages other than Java that consume CDC (Change Data Capture) changed data records. For example, C/C++, Python, and C#/.NET are supported.

This functionality is provided by an open format decoder/deserializer in the HPE Ezmeral Data Fabric Streams C library. The decoder translates the internal format to the open data format, deserializes the data, and returns the value of the changed data record as a human-readable JSON string.

Any language that binds to the HPE Ezmeral Data Fabric Streams C library can retrieve the open data format and, with a simple JSON parser, consume changed data records.