Saturday, September 13, 2014

pg_dbms_stats - Fixing relation and column statistics and planning, and managing stable query performance - PostgreSQL

pg_dbms_stats, which is a little popular module in Japan, fix relation(it means table or index) and column statistics, and then make stable query planning. This article mainly shows how to operate it with simple scenario as follows. If you want to know basically usage (i.g., installation, creation of extension), please check out the official manual.
  1. Lock relation and column statistics
  2. Confirmation of state of locked statistics
  3. Confirmation of statistics used by planner
1. Lock relation and column statistics
dbms_stats.lock() function with one argument locks both relation and column statistics. Also we can specify two (schema, relation), or three(schema, relation, column) arguments to this function.
=# SELECT relname, relpages, reltuples, relallvisible FROM pg_class WHERE relname = 'hoge';
+-----------+------------+-------------+---------------+
| relname | relpages | reltuples | relallvisible |
+-----------+------------+-------------+---------------+
| hoge      |      1       |  100       |             1    |
+-----------+------------+-------------+---------------+
(1 row)
=# SELECT dbms_stats.lock('hoge');
+------+
| lock |
+------+
| hoge |
+------+
(1 row)
2. Confirmation of state of locked statistics
The each locked statistics are stored into dbms_stats._relation_stats_locked or dbms_stats._column_stats_locked.
=# SELECT * FROM dbms_stats._relation_stats_locked;
+-[ RECORD 1 ]-----+-------------------------------------------------+
| relid                      | 16384                                                |
| relname                | public.hoge                                         |
| relpages               | 1                                                         |
| reltuples               | 100                                                     |
| relallvisible           | 1                                                         |
| curpages              | 1                                                         |
| last_analyze          |                                                           |
| last_autoanalyze   | 2014-08-18 22:35:13.744151+09 |
+------------------+-------------------------------------------------------+
=# SELECT * FROM dbms_stats._column_stats_locked;
+-[ RECORD 1 ]+-------+
| starelid        | 16384 |
| staattnum   | 1     |
| stainherit    |  f     |
| stanullfrac  | 0     |
| stawidth     | 4     |
| stadistinct  | 1     |
| stakind1     | 1     |
| stakind2     | 3     |
| stakind3     | 0     |
| stakind4     | 0     |
| stakind5     | 0     |
| staop1       | 96    |
| staop2       | 97    |
| staop3       | 0     |
| staop4       | 0     |
| staop5       | 0     |
| stanumbers1 | {1}   |
| stanumbers2 | {1}   |
| stanumbers3 |       |
| stanumbers4 |       |
| stanumbers5 |       |
| stavalues1  | {0}   |
| stavalues2  |       |
| stavalues3  |       |
| stavalues4  |       |
| stavalues5  |       |
+-------------+-------+

3. Confirmation of statistics used by planner
The planner use both dbms_stats.column_stats_effective and dbms_stats.relation_stats_effective tables actually. These tables are merged two tables(original one and locked one) tables. i.g., The locked statistics are used preferentially if there are, and then original statistics are used for rest of unlocked statistics.

You might want to say after reading this article, it is possible same operation with disable analyzing  and autovacuum. Note that pg_dbms_stats has also actually physical size, which is used for planning, into dbms_stats.relation_stats_locked. So planning will be stable.

Monday, September 8, 2014

Customize psql prompting format - PostgreSQL

I introduce about how to display line number on psql command, and how to customize them here. Actually this function needs latest version of PostgreSQL (from 9.5).
Here is development version manual.

The manual says,
%l
The line number inside the current statement, starting from 1.
We can set variable settings to psql output style, output format, prompt characters and so on. One of them is showing line number on psql.
You guys may used to it for now, it's really simple as default.
e.g., we can set prompt formatting and sample operation as follows.
postgres=# \set PROMPT1 '[%/][%>](%l)%R%# '
[postgres][55432](1)=# \set PROMPT2 '[%/][%>](%l)%R%# '
[postgres][55432](1)=# SELECT
[postgres][55432](2)=# aaaaaa -- error position
[postgres][55432](3)=# FROM
[postgres][55432](4)=# hoge; 
ERROR:  column "aaaaaa" does not exist
LINE 2: aaaaaa
        ^ 
How is this?
The prompt character means database name(%/), port number,(%>) line number (%l)from left. Please see describe them in manual.

If you feel waste to set prompting characters whenever execution of psql then psql read setting file (~/.psqlrc) before lunching itself. So if we write them to setting file then we don't need to set manually each time. Please try that. The sample .psqlrc file is following.
$ cat ~/.psqlrc
\pset border 2
\set PROMPT2 '[%/][%>](%l)%R%# '
\set PROMPT1 '[%/][%>](%l)%R%# '