MapR Database JSON Tables

JSON documents are stored in MapR Database JSON tables. MapR Database supports schema flexibility in the documents and provides the tools to efficiently access them. It optimizes the storage of the JSON documents, providing high performance.

When a JSON document is added to a JSON table, it is put in a table row. The table row is part of one column family (although you can create more, as described in Column Families in JSON Tables). The value in the row is a single JSON document that is stored in a binary format. The binary format allows MapR Database to make a number of optimizations to the document’s layout to make data access fast and efficient. MapR Database also maintains the data types associated with fields in a JSON document.

The JSON documents in a table need not have identical structures. It is possible to include in a table any number of JSON documents that have no common fields or share only a subset of fields.

For example, an online retailer might have the following three documents in a single JSON table. Only a subset of fields is common to all three documents. These are key differences:

Each document has a different nested document in a field named specifications.
Only two of the documents have arrays in the field features.
The retailers field has different types in the first and third documents.

{
    "_id" : "ID1",
    "product_ID" : "4GGC859",
    "name" : "Thresher 1000",
    "brand" : "Careen",
    "category" : "Bicycle",
    "type" : "Road bicycle",
    "price" : 2949.99,

    "specifications" : {
        "size" : "55cm",
        "wheel_size" : "700c",
        "frameset" : {
            "frame" : "Carbon Enduro",
            "fork" : "Gabel 2"
        },
        "groupset" : {
            "chainset" : "Kette 230",
            "brake" : "Bremse FullStop"
	  },
        "wheelset" : {
	        "wheels" : "Rad Schnell 10",
	        "tyres" : "Reifen Pro"
        }
    },

    "retailers": {
        "name" : "Eden Bicycles",
        "location" : {
            "city" : "Castro Valley",
            "state" : "CA"
        }
    }
}

{
    "_id" : "ID2",
    "product_ID" : "2DT3201",
    "name" : " Allegro SPD-SL 6800",
    "brand" : "Careen",
    "category" : "Pedals",
    "type" : "Components",
    "price" : 112.99,

    "features" : [
        "Low-profile design",
        "Floating SH11 cleats included"
    ],

    "specifications" : {
        "weight_per_pair" : "260g",
        "color" : "black"
    }
}

{
    "_id" : "ID3",
    "product_ID" : "3ML6758",
    "name" : "Trikot 24-LK",
    "brand" : "Careen",
    "category" : "Jersey",
    "type" : "Clothing",
    "price" : 76.99,

    "features" : [
        "Wicks away moisture.",
        "SPF-30",
        "Reflects light at night."
    ],

    "specifications" : {
        "sizes" : ["S","M","L","XL","XXL"],
        "colors" : [
            "white",
            "navy",
            "green"
        ]
    },

    "retailers" : [
        {
            "name" : "Bespoke Cycles",
            "city": "San Francisco",
            "state" : "CA"
        },
        {
            "name" : "Trek Bicycle",
            "city" : "New York",
            "state" : "NY"
        }
    ]       
}

Container Syntax

Starting in MapR Database 6.1, even though the retailers field is an array of nested documents in document 1 and a nested document in document 3, you can reference subfields of the nested documents in both documents using the following container syntax:

retailers[].name

Specifying that field reference returns the following for the three documents:

{
    "retailers":{"name":"Eden Bicycles"}
}
{}
{
    "retailers":[
        {"name":"Bespoke Cycles"},
        {"name":"Trek Bicycle"}]
}

NOTE An empty document is returned for the second document because that document does not have a retailers field.

See Container Field Paths for more information.

Table Paths

Tables are stored in the MapR filesystem. When providing the path to a table in MapR tools and APIs, use these conventions:

For a path on the local cluster, start the path at the volume mount point. For example, for a table named test under a volume with a mount point at /volume1, specify the following path: /volume1/test
For a path on a remote cluster, you must also specify the cluster name in the path. For example, for a table named customer in volume1 in the sanfrancisco cluster, specify the following path: /mapr/sanfrancisco/volume1/customer

Tools for Creating and Administering JSON Tables

These are the tools available for creating and administering JSON tables in MapR Database:

MapR Database Shell: This shell is a light-weight tool for manipulating JSON tables and documents. Learn more about it at MapR Database Shell (JSON Tables).
MapR Database JSON Client API: This API allows you to manage MapR Database JSON tables. The API includes methods to create, alter, and drop tables and column families. Learn more about these APIs at Managing JSON Tables.
Python OJAI Client: This API allows you to create and drop MapR Database JSON tables in Python. Learn more about it at Using the Python OJAI Client.
MapR Database JSON REST API: The REST API allows you to create and drop MapR Database JSON tables using HTTP calls. Learn more about it at Using the MapR Database JSON REST API.
MapR Database JSON utilities: MapR Database JSON supports several utilities for loading tables. Learn more about these utilities at Loading Documents into JSON Tables.
maprcli commands: The maprcli table commands fully support JSON tables. See table.

NOTE For a list of tools available to query and manage documents in MapR Database JSON tables, see Tools for Working with JSON Documents.