What can you do with PostgreSQL and JSON?

PostgreSQL 9.2 added a native JSON data type, but didn’t add much else. You’ve got three options if you actually want to do something with it:

  1. Wait for PostgreSQL 9.3 (or use the beta)
  2. Use the plv8 extension. Valid option, but more DIY (you’ll have to define your own functions)
  3. Use the json_enhancements extension, which backports the new JSON functionality in 9.3 to 9.2

I wanted to use this stuff now, and I opted to go with option 3. I wrote a blog post which should help you get going if you want to go this route: adding json_enhancements to PostgreSQL 9.2.

So let’s assume you’re on either 9.3, or 9.2 with json_enhancements. What can you do? Lots! All the new JSON operators and functions are in the 9.3 documentation, so I’m going to run through some of the more fun things you can do along with a real-world use case.

Get started

Create a database to play about in:

createdb json_test
psql json_test

With some sample data:

CREATE TABLE books ( id integer, data json );

INSERT INTO books VALUES (1, '{ "name": "Book the First", "author": { "first_name": "Bob", "last_name": "White" } }');
INSERT INTO books VALUES (2, '{ "name": "Book the Second", "author": { "first_name": "Charles", "last_name": "Xavier" } }');
INSERT INTO books VALUES (3, '{ "name": "Book the Third", "author": { "first_name": "Jim", "last_name": "Brown" } }');

Selecting

You can use the JSON operators to pull values out of JSON columns:

SELECT id, data->>'name' AS name FROM books;

 id |      name
----+-----------------
  1 | Book the First
  2 | Book the Second
  3 | Book the Third

The -> operator returns the original JSON type (which might be an object), whereas ->> returns text. Because -> hands back JSON, you can use it to reach a nested object and chain the operators:

SELECT id, data->'author'->>'first_name' AS author_first_name FROM books;

 id | author_first_name
----+-------------------
  1 | Bob
  2 | Charles
  3 | Jim

How cool is that?
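If you want to see the difference between the two operators for yourself, a quick check along these lines may help. It is only a sketch against the books table above; pg_typeof is a built-in function that reports the type of an expression, and the column aliases are names I made up for the example.

```sql
-- Compare what -> and ->> return for the same key.
SELECT data->'name'             AS as_json,    -- json value, keeps the double quotes
       data->>'name'            AS as_text,    -- plain text, quotes stripped
       pg_typeof(data->'name')  AS json_type,  -- reports json
       pg_typeof(data->>'name') AS text_type   -- reports text
FROM books
WHERE id = 1;
```

The first column comes back as "Book the First" with its quotes, the second as Book the First without them, which is why ->> is the one to use when comparing against ordinary strings.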

Filtering

Of course, you can also select rows based on a value inside your JSON:

SELECT * FROM books WHERE data->>'name' = 'Book the First';

 id |                                         data
----+---------------------------------------------------------------------------------------
  1 | { "name": "Book the First", "author": { "first_name": "Bob", "last_name": "White" } }

You can also find rows based on the value of a nested JSON object:

SELECT * FROM books WHERE data->'author'->>'first_name' = 'Charles';

 id |                                            data
----+---------------------------------------------------------------------------------------------
  2 | { "name": "Book the Second", "author": { "first_name": "Charles", "last_name": "Xavier" } }
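When the nesting gets deeper, chaining can become noisy. The 9.3 documentation also lists the path operators #> and #>>, which take a text array of keys instead. Here is a sketch of the same filter written that way, assuming the books table above (I have not checked how completely json_enhancements covers these on 9.2):

```sql
-- Same filter as the chained version, using a path instead.
-- '{author,first_name}' is a text array literal naming each key in turn.
SELECT id, data#>>'{author,first_name}' AS author_first_name
FROM books
WHERE data#>>'{author,first_name}' = 'Charles';
```

Like ->>, the #>> form returns text, while #> returns JSON.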

Indexing

You can add indexes on any of these using PostgreSQL’s expression indexes, which means you can even add unique constraints based on your nested JSON data:

CREATE UNIQUE INDEX books_author_first_name ON books ((data->'author'->>'first_name'));

INSERT INTO books VALUES (4, '{ "name": "Book the Fourth", "author": { "first_name": "Charles", "last_name": "Davis" } }');

ERROR:  duplicate key value violates unique constraint "books_author_first_name"
DETAIL:  Key (((data -> 'author'::text) ->> 'first_name'::text))=(Charles) already exists.

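Expression indexes are somewhat expensive to create and have to be maintained on every write, but once in place they make querying on the indexed JSON property very fast. Each index covers only the exact expression it was built on, so you need one per property you want to filter on.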
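Indexes don't have to be unique, of course. As a rough sketch, this adds a plain expression index on the book name and asks the planner how it would run a matching query. The index name books_name_idx is made up for the example.

```sql
-- A plain (non-unique) expression index on the book name.
-- The index name books_name_idx is hypothetical.
CREATE INDEX books_name_idx ON books ((data->>'name'));

-- Ask the planner how it would run a query on the indexed expression.
-- The WHERE clause uses the same expression as the index definition.
EXPLAIN SELECT id FROM books WHERE data->>'name' = 'Book the First';
```

Don't be surprised if the plan still shows a sequential scan on a table this small. With only a handful of rows the planner will usually decide that reading the whole table is cheaper than going through the index.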

A real world example

OK, let’s give this a go with a real life use case. Let’s say we’re tracking analytics, so we have an events table:

CREATE TABLE events (
  name varchar(200),
  visitor_id varchar(200),
  properties json,
  browser json
);

We’re going to store events in this table, like pageviews. Each event has properties, which could be anything (e.g. current page) and also sends information about the browser (like OS, screen resolution, etc). Both of these are completely free form and could change over time (as we think of extra stuff to track).

Let’s insert a couple of events:

INSERT INTO events VALUES ( 'pageview', '1', '{ "page": "/" }', '{ "name": "Chrome", "os": "Mac", "resolution": { "x": 1440, "y": 900 } }' );
INSERT INTO events VALUES ( 'pageview', '2', '{ "page": "/" }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1920, "y": 1200 } }' );
INSERT INTO events VALUES ( 'pageview', '1', '{ "page": "/account" }', '{ "name": "Chrome", "os": "Mac", "resolution": { "x": 1440, "y": 900 } }' );
INSERT INTO events VALUES ( 'purchase', '5', '{ "amount": 10 }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1024, "y": 768 } }' );
INSERT INTO events VALUES ( 'purchase', '15', '{ "amount": 200 }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1280, "y": 800 } }' );
INSERT INTO events VALUES ( 'purchase', '15', '{ "amount": 500 }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1280, "y": 800 } }' );

Hm, this is starting to remind me of MongoDB!
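Since properties is free form, it can be handy to check which keys have actually been recorded so far. One way to do that, sketched here with the json_object_keys function from the 9.3 documentation (the alias key is just an illustrative name):

```sql
-- List every top-level key that appears in any event's properties.
-- json_object_keys returns one row per key of the outermost object.
SELECT DISTINCT json_object_keys(properties) AS key
FROM events
ORDER BY key;
```

With the six sample events above this should list amount and page.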

Collect some stats

Using the JSON operators, combined with traditional PostgreSQL aggregate functions, we can pull out whatever we want. You have the full might of an RDBMS at your disposal.

Browser usage?

SELECT browser->>'name' AS browser, count(browser) FROM events GROUP BY browser->>'name';

 browser | count
---------+-------
 Firefox |     4
 Chrome  |     2

Total revenue per visitor?

SELECT visitor_id, SUM(CAST(properties->>'amount' AS integer)) AS total
FROM events
WHERE CAST(properties->>'amount' AS integer) > 0
GROUP BY visitor_id;

 visitor_id | total
------------+-------
 5          |    10
 15         |   700
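The WHERE clause above works because events without an amount key yield NULL, which fails the comparison. A slightly more explicit variant, sketched under the assumption that revenue only ever comes from events named purchase, filters on the name column instead:

```sql
-- Sum purchase amounts per visitor, selecting purchases by event name.
-- ::integer is PostgreSQL shorthand for CAST(... AS integer).
SELECT visitor_id,
       SUM((properties->>'amount')::integer) AS total
FROM events
WHERE name = 'purchase'
GROUP BY visitor_id
ORDER BY total DESC;
```

On the sample data this gives the same totals as before (700 for visitor 15, 10 for visitor 5), and it avoids casting a JSON value in the WHERE clause for every row.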

Average screen resolution?

SELECT AVG(CAST(browser->'resolution'->>'x' AS integer)) AS width,
       AVG(CAST(browser->'resolution'->>'y' AS integer)) AS height
FROM events;

         width         |        height
-----------------------+----------------------
 1397.3333333333333333 | 894.6666666666666667

You’ve probably got the idea, so I’ll leave it here.
