From 91484409bdd17f330d10671d388b72d4ef1451d7 Mon Sep 17 00:00:00 2001 From: Fujii Masao Date: Sun, 8 Dec 2013 02:06:02 +0900 Subject: Expose qurey ID in pg_stat_statements view. The query ID is the internal hash identifier of the statement, and was not available in pg_stat_statements view so far. Daniel Farina, Sameer Thakur and Peter Geoghegan, reviewed by me. --- doc/src/sgml/pgstatstatements.sgml | 75 ++++++++++++++++++++++++++++++-------- 1 file changed, 60 insertions(+), 15 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml index c02fdf44833..c607710ccda 100644 --- a/doc/src/sgml/pgstatstatements.sgml +++ b/doc/src/sgml/pgstatstatements.sgml @@ -23,11 +23,12 @@ The <structname>pg_stat_statements</structname> View - The statistics gathered by the module are made available via a system view - named pg_stat_statements. This view contains one row for - each distinct query, database ID, and user ID (up to the maximum - number of distinct statements that the module can track). The columns - of the view are shown in . + The statistics gathered by the module are made available via a + system view named pg_stat_statements. This view + contains one row for each distinct database ID, user ID and query + ID (up to the maximum number of distinct statements that the module + can track). The columns of the view are shown in + . @@ -57,7 +58,14 @@ OID of database in which the statement was executed - + + queryid + bigint + + Internal hash identifier, computed from the entry's post-parse-analysis tree + + + query text @@ -189,9 +197,10 @@ - For security reasons, non-superusers are not allowed to see the text of - queries executed by other users. They can see the statistics, however, - if the view has been installed in their database. + For security reasons, non-superusers are not allowed to see the SQL + text or queryid of queries executed by other users. They can see + the statistics, however, if the view has been installed in their + database. @@ -209,8 +218,9 @@ When a constant's value has been ignored for purposes of matching the query to other queries, the constant is replaced by ? in the pg_stat_statements display. The rest of the query - text is that of the first query that had the particular hash value - associated with the pg_stat_statements entry. + text is that of the first query that had the particular + queryid hash value associated with the + pg_stat_statements entry. @@ -223,10 +233,45 @@ - Since the hash value is computed on the post-parse-analysis representation - of the queries, the opposite is also possible: queries with identical texts - might appear as separate entries, if they have different meanings as a - result of factors such as different search_path settings. + Since the queryid hash value is computed on the + post-parse-analysis representation of the queries, the opposite is + also possible: queries with identical texts might appear as + separate entries, if they have different meanings as a result of + factors such as different search_path settings. + + + + Consumers of pg_stat_statements may wish to use + queryid (perhaps in composite with + dbid and userid) as a more stable + and reliable identifier for each entry than its query text. + However, it is important to understand that there are only limited + guarantees around the stability of the queryid hash + value. Since the identifier is derived from the + post-parse-analysis tree, its value is a function of, among other + things, the internal identifiers that comprise this representation. + This has some counterintuitive implications. For example, a query + against a table that is fingerprinted by + pg_stat_statements will appear distinct to a + subsequently executed query that a reasonable observer might judge + to be a non-distinct, if in the interim the table was dropped and + re-created. The hashing process is sensitive to difference in + machine architecture and other facets of the platform. + Furthermore, it is not safe to assume that queryid + will be stable across major versions of PostgreSQL. + + + + As a rule of thumb, an assumption of the stability or comparability + of querid values should be predicated on the the + underlying catalog metadata and hash function implementation + details exactly matching. Any two servers participating in any + variety of replication based on physical WAL-replay can be expected + to have identical querid values for the same query. + Logical replication schemes do not have replicas comparable in all + relevant regards, and so querid will not be a + useful identifier for accumulating costs for the entire replica + set. If in doubt, direct testing is recommended. -- cgit v1.2.3