Snippet for Fun

Read/Query JSON Value with SQLAlchemy

Posted at — Aug 7, 2019

Background

Nowadays JSON is well supported in RDBMS like MySQL and Postgresql. We can have all benefit by using a single RDBMS. For example, it’s easy to design a table schema with mixing structured data and schema-less/dynamic schema data. This snippet will show you how to read & query JSON field value with SQLAlchemy from MySQL.

Example

Suppose we have an e-commerce site with a database table called product. To store both product data and product attributes in the same table, we can design a schema like this (some fields omitted for brevity):

Querying from MySQL

To query a JSON field, we can use the JSON_EXTRACT to get one field value:

select
  id, JSON_EXTRACT(attributes, '$.color') as product_color
from product;
+------+-----------------+
| id   | product_color   |
|------+-----------------|
| 1    | "red"           |
| 2    | "yellow"        |
| 3    | "yellow"        |
+------+-----------------+

But wait a second, there are quotes in the string value of product_color. To remove them, query with JSON_UNQUOTE:

select
  id, JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.color')) as product_color
from product;
+------+-----------------+
| id   | product_color   |
|------+-----------------|
| 1    | red             |
| 2    | yellow          |
| 3    | yellow          |
+------+-----------------+

Reading from SQLAlchemy

SQLAlchemy provides us a way to define RDBMS models in declarative style:

import sqlalchemy as sa
from sqlalchemy.dialects import mysql
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()


class Product(Base):

    __tablename__ = 'product'

    id = sa.Column(sa.Integer, primary_key=True)
    sku = sa.Column(sa.String(128), unique=True)
    attributes = sa.Column(mysql.MEDIUMTEXT)

But these fields are one-to-one mapped to the table column. For the JSON field, we can make use of the hybrid_property extension:

import json
from typing import Optional

import sqlalchemy as sa
from sqlalchemy import func as sa_func
from sqlalchemy.dialects import mysql
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property


class Product(Base):

    __tablename__ = 'product'

    id = sa.Column(sa.Integer, primary_key=True)
    sku = sa.Column(sa.String(128), unique=True)
    _attributes = sa.Column('attributes', mysql.MEDIUMTEXT)

    @hybrid_property
    def attributes(self) -> dict:
        """Hybrid property for reading product attributes JSON dict.

        It always returns a dict.
        """
        if self._attributes is None:
            # null value from db side, return an empty dict
            return {}
        try:
            return json.loads(self._attributes)
        except Exception:
            # parse failed, return an empty dict
            return {}

    @hybrid_property
    def color(self) -> Optional[str]:
        """Hybrid property for reading color value from attributes JSON dict.
        """
        return self.attributes.get('color')

    @hybrid_property
    def size(self) -> Optional[str]:
        """Hybrid property for reading size value from attributes JSON dict.
        """
        return self.attributes.get('size')

Then, we can read the JSON field as a normal python object property:

>>> Session().query(Product).first().color
'red'

Querying from SQLAlchemy

So come to the next question, how can we query the JSON field likes we do in the SQL? Don’t worry, the powerful SQLAlchemy can solve this problem easily:

# ... import omitted

class Product(Base):

    __tablename__ = 'product'

    id = sa.Column(sa.Integer, primary_key=True)
    sku = sa.Column(sa.String(128), unique=True)
    _attributes = sa.Column('attributes', mysql.MEDIUMTEXT)

    @hybrid_property
    def attributes(self) -> dict:
        """Hybrid property for reading product attributes JSON dict.

        It always returns a dict.
        """
        if self._attributes is None:
            # null value from db side, return an empty dict
            return {}
        try:
            return json.loads(self._attributes)
        except Exception:
            # parse failed, return an empty dict
            return {}

    @hybrid_property
    def color(self) -> Optional[str]:
        """Hybrid property for reading color value from attributes JSON dict.
        """
        return self.attributes.get('color')

    @color.expression
    def color(cls):
        """SQL expression for querying color value in mysql."""
        field = sa_func.json_extract(cls._attributes, '$.color')
        return sa_func.json_unquote(field)

    @hybrid_property
    def size(self) -> Optional[str]:
        """Hybrid property for reading size value from attributes JSON dict.
        """
        return self.attributes.get('size')

    @size.expression
    def size(cls):
        """SQL expression for querying size value in mysql."""
        field = sa_func.json_extract(cls._attributes, '$.size')
        return sa_func.json_unquote(field)

For the color/size property, we will define two extra expressions methods, which will be compiled as part of the SQL during the querying execution. We can also integrate some much more complex expressions within this method.

Conclusion

In MySQL, we can convert a string column into JSON and query data with the JSON_EXTRACT function. Meanwhile, SQLAlchemy equips us with the powerful hybrid property extension to handle reading/querying from schema-less columns use case.

A full example of code can be check out from the GitHub repo.