 <Chapter Id="trouble">
  <Title>Troubleshooting</Title>

  <sect1>
   <title>Postmaster Startup Failures</title>

   <para>
    There are several common reasons for the postmaster to fail to start up.
    Check the postmaster's log file, or start it by hand (without redirecting
    standard output or standard error) to see what complaint messages appear.
    Some of the possible error messages are reasonably self-explanatory,
    but here are some that are not:
   </para>

   <para>
    <ProgramListing>
FATAL: StreamServerPort: bind() failed: Address already in use
        Is another postmaster already running on that port?
    </ProgramListing>
    This usually means just what it suggests: you accidentally started a
    second postmaster on the same port where one is already running.
    However, if the kernel error message is not "Address already in use"
    or some variant of that wording, there may be a different problem.
    For example, trying to start a postmaster on a reserved port number
    (on most Unix systems, ports below 1024 are reserved for root) may
    draw something like
    <ProgramListing>
$ postmaster -i -p 666
FATAL: StreamServerPort: bind() failed: Permission denied
        Is another postmaster already running on that port?
    </ProgramListing>
   </para>

   <para>
    <ProgramListing>
IpcMemoryCreate: shmget failed (Invalid argument) key=5440001, size=83918612, permission=600
FATAL 1:  ShmemCreate: cannot create region
    </ProgramListing>
    A message like this probably means that your kernel's limit on the size
    of shared memory areas is smaller than the buffer area that Postgres
    is trying to create.  (Or it could mean that you don't have SysV-style
    shared memory support configured into your kernel at all.)  As a temporary
    workaround, you can try starting the postmaster with a smaller-than-normal
    number of buffers (-B switch).  You will eventually want to reconfigure
    your kernel to increase the allowed shared memory size, however.
    You may see this message when trying to start multiple postmasters on
    the same machine, if their total space requests exceed the kernel limit.
   </para>

   <para>
    <ProgramListing>
IpcSemaphoreCreate: semget failed (No space left on device) key=5440026, num=16, permission=600
    </ProgramListing>
    A message like this does <emphasis>not</emphasis> mean that you've run out
    of disk space; it means that your kernel's limit on the number of SysV
    semaphores is smaller than the number Postgres wants to create.  As above,
    you may be able to work around the problem by starting the postmaster with
    a reduced number of backend processes (-N switch), but you'll eventually
    want to increase the kernel limit.
   </para>

  </sect1>

  <sect1>
   <title>Client Connection Problems</title>

   <para>
    Once you have a running postmaster, trying to connect to it with
    client applications can fail for a variety of reasons.  The sample
    error messages shown here are for clients based on recent versions
    of libpq --- clients based on other interface libraries may produce
    other messages with more or less information.
   </para>

   <para>
    <ProgramListing>
connectDB() -- connect() failed: Connection refused
Is the postmaster running (with -i) at 'server.joe.com' and accepting connections on TCP/IP port '5432'?
    </ProgramListing>
    This is the generic "I couldn't find a postmaster to talk to" failure.
    It looks like the above when TCP/IP communication is attempted, or like
    this when attempting Unix-socket communication to a local postmaster:
    <ProgramListing>
connectDB() -- connect() failed: No such file or directory
Is the postmaster running at 'localhost' and accepting connections on Unix socket '5432'?
    </ProgramListing>
    The last line is useful in verifying that the client is trying to connect
    where it is supposed to.  If there is in fact no postmaster
    running there, the kernel error message will typically be either
    "Connection refused" or "No such file or directory", as illustrated.
    (It is particularly important to realize that "Connection refused" in
    this context does <emphasis>not</emphasis> mean that the postmaster
    got your connection request and rejected it --- that case will produce
    a different message, as shown below.)
    Other error messages such as "Connection timed out" may indicate more
    fundamental problems, like lack of network connectivity.
   </para>
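
   <para>
    To tell these cases apart without involving libpq, it can help to try a
    bare TCP connection to the postmaster's port.  The following Python
    fragment is a minimal sketch and not part of Postgres; the function name
    <function>probe</function> and the host and port passed to it are
    assumptions made for illustration.  It only tests whether something is
    listening on the port, so a successful result says nothing about
    pg_hba.conf or authentication.
    <ProgramListing>
import socket

def probe(host, port, timeout=3.0):
    """Try a TCP connection and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        # Nothing is accepting connections on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timed out"
    except OSError as exc:
        # Any other kernel-level failure, e.g. an unreachable network.
        return "error: %s" % exc

# Hypothetical target: the default port on the local machine.
print(probe("localhost", 5432))
    </ProgramListing>
    A result of "refused" matches the "Connection refused" case described
    above, while "timed out" points toward a network connectivity problem
    rather than a missing postmaster.
   </para>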

   <para>
    <ProgramListing>
No pg_hba.conf entry for host 123.123.123.123, user joeblow, database testdb
    </ProgramListing>
    This is what you are most likely to get if you succeed in contacting
    a postmaster, but it doesn't want to talk to you.  As the message
    suggests, the postmaster refused the connection request because it
    found no authorizing entry in its pg_hba.conf configuration file.
   </para>

   <para>
    <ProgramListing>
Password authentication failed for user 'joeblow'
    </ProgramListing>
    Messages like this indicate that you contacted the postmaster, and it's
    willing to talk to you, but not until you pass the authorization method
    specified in the pg_hba.conf file.  Check the password you're providing,
    or check your Kerberos or IDENT software if the complaint mentions
    one of those authentication types.
   </para>

   <para>
    <ProgramListing>
FATAL 1:  SetUserId: user 'joeblow' is not in 'pg_shadow'
    </ProgramListing>
    This is another variant of authentication failure: no Postgres
    <command>CREATE USER</command> command has been executed for the
    given username.
   </para>

   <para>
    <ProgramListing>
FATAL 1:  Database testdb does not exist in pg_database
    </ProgramListing>
    There's no database by that name under the control of this postmaster.
    Note that if you don't specify a database name, it defaults to your
    Postgres username, which may or may not be the right thing.
   </para>

  </sect1>

  <sect1>
   <title>Debugging Messages</title>

   <para>
    The <Application>postmaster</Application> occasionally prints out
    messages which are often helpful during troubleshooting.  If you wish
    to view debugging messages from the <Application>postmaster</Application>,
    you can start it with the -d option and redirect the output to
    the log file:

    <ProgramListing>
$ postmaster -d > pm.log 2>&1 &
    </ProgramListing>

    If you do not wish to see these messages, you can type
    <ProgramListing>
$ postmaster -S
    </ProgramListing>
    and the <Application>postmaster</Application> will be "S"ilent.
    Notice that there is no ampersand ("&amp;") at the end of the last
    example, so the postmaster will be running in the foreground.
   </Para>

   <sect2 Id="pg-options-trouble">
    <Title>pg_options</Title>

    <Para>
     <Note>
      <Para>
       Contributed by <ULink url="mailto:dz@cs.unitn.it">Massimo Dal Zotto</ULink>
      </Para>
     </Note>
    </para>
    <Para>
     The optional file <filename>data/pg_options</filename> contains runtime
     options used by the backend to control trace messages and other backend
     tunable parameters.
     What makes this file interesting is that a backend re-reads it
     when it receives a SIGHUP signal, which makes it possible to change
     run-time options on the fly without restarting
     <productname>Postgres</productname>.
     The options specified in this file may be debugging flags used by the trace
     package (<filename>backend/utils/misc/trace.c</filename>) or numeric
     parameters which can be used by the backend to control its behaviour.
     New options and parameters must be defined in
     <filename>backend/utils/misc/trace.c</filename> and
     <filename>backend/include/utils/trace.h</filename>.
    </para>

   <para>
    pg_options can also be specified with the <option>-T</option> switch of 
    <productname>Postgres</productname>:

    <programlisting>
postgres <replaceable>options</replaceable> -T "verbose=2,query,hostlookup-"
    </programlisting>
   </para>

   <Para>
    The functions used for printing errors and debug messages can now make use
    of the <citetitle>syslog(3)</citetitle> facility. Messages printed to stdout
    or stderr are prefixed by a timestamp that also contains the backend pid:

    <programlisting>
#timestamp          #pid    #message
980127.17:52:14.173 [29271] StartTransactionCommand
980127.17:52:14.174 [29271] ProcessUtility: drop table t;
980127.17:52:14.186 [29271] SIIncNumEntries: table is 70% full
980127.17:52:14.186 [29286] Async_NotifyHandler
980127.17:52:14.186 [29286] Waking up sleeping backend process
980127.19:52:14.292 [29286] Async_NotifyFrontEnd
980127.19:52:14.413 [29286] Async_NotifyFrontEnd done
980127.19:52:14.466 [29286] Async_NotifyHandler done
    </programlisting>
   </para>

   <para>
    This format improves readability of the logs and allows people to understand
    exactly which backend is doing what and at which time. It also makes it
    easier to write simple awk or perl scripts that monitor the log to
    detect database errors or problems, or to compute transaction time statistics.
   </para>
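
   <para>
    As an illustration of such a monitoring script, the following Python
    fragment is a minimal sketch that splits each line into timestamp,
    backend pid, and message, and counts messages per backend.  It assumes
    that the prefix always has exactly the layout shown in the sample above;
    the names <function>parse_line</function> and
    <function>messages_per_backend</function> are hypothetical and not part
    of Postgres.
    <programlisting>
import re
from collections import Counter

# Matches lines such as "980127.17:52:14.173 [29271] StartTransactionCommand".
LOG_LINE = re.compile(r"^(\d{6}\.\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] (.*)$")

def parse_line(line):
    """Return (timestamp, pid, message), or None if the line has no prefix."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    timestamp, pid, message = match.groups()
    return timestamp, int(pid), message

def messages_per_backend(lines):
    """Count prefixed log lines per backend pid, skipping unprefixed output."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts

sample = [
    "980127.17:52:14.173 [29271] StartTransactionCommand",
    "980127.17:52:14.174 [29271] ProcessUtility: drop table t;",
    "980127.17:52:14.186 [29286] Async_NotifyHandler",
    "a line written without the timestamp prefix",
]
print(messages_per_backend(sample))
    </programlisting>
    Lines that lack the prefix are skipped rather than treated as errors,
    so output written without a timestamp does not break the count.
   </para>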

   <para>
    Messages printed to syslog use the log facility LOG_LOCAL0.
    The use of syslog can be controlled with the syslog pg_option.
    Unfortunately many functions call <function>printf()</function> directly
    to print their messages to stdout or stderr, and this output can't be
    redirected to syslog or have timestamps in it.
    Ideally, all calls to printf would be replaced with the
    PRINTF macro, and output to stderr would be changed to use EPRINTF instead,
    so that we can control all output in a uniform way.
   </Para>

    <para>
     The format of the <filename>pg_options</filename> file is as follows:

     <programlisting>
# <replaceable>comment</replaceable>
<replaceable>option</replaceable>=<replaceable class="parameter">integer_value</replaceable>  # set value for <replaceable>option</replaceable>
<replaceable>option</replaceable>                # set <replaceable>option</replaceable> = 1
<replaceable>option</replaceable>+               # set <replaceable>option</replaceable> = 1
<replaceable>option</replaceable>-               # set <replaceable>option</replaceable> = 0
     </programlisting>

     Note that <replaceable class="parameter">option</replaceable> can also be
     an abbreviation of the option name defined in
     <filename>backend/utils/misc/trace.c</filename>.
    </Para>
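
    <para>
     The rules above can be sketched in a few lines of Python.  This is an
     illustration of the file format only, not the parser used by the
     backend: the function names are hypothetical, abbreviated option names
     are not resolved, and treating the <option>-T</option> value as a
     comma-separated list is an assumption based on the example shown
     earlier.  The option names in the sample are taken from that same
     example.
     <programlisting>
def parse_option(token):
    """Parse one setting into a (name, value) pair.

    "name=N" sets N, a bare "name" or "name+" sets 1, and "name-" sets 0.
    """
    token = token.strip()
    if "=" in token:
        name, value = token.split("=", 1)
        return name.strip(), int(value)
    if token.endswith("+"):
        return token[:-1], 1
    if token.endswith("-"):
        return token[:-1], 0
    return token, 1

def parse_pg_options(text):
    """Parse file contents: one setting per line, "#" starts a comment."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            name, value = parse_option(line)
            options[name] = value
    return options

def parse_switch_value(value):
    """Parse a comma-separated list such as "verbose=2,query,hostlookup-"."""
    return dict(parse_option(token) for token in value.split(","))

example = """
# trace settings
verbose=2    # numeric value
query        # same as query=1
hostlookup-  # same as hostlookup=0
"""
print(parse_pg_options(example))
print(parse_switch_value("verbose=2,query,hostlookup-"))
     </programlisting>
     Both calls produce the same settings, which shows how a file and the
     equivalent switch value relate under these assumptions.
    </para>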

    <Para>
     Refer to <xref linkend="pg-options-title" endterm="pg-options-title">
     for a complete list of option keywords and possible values.
    </para>
   </sect2>
  </sect1>
 </Chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:"/usr/lib/sgml/CATALOG"
sgml-local-ecat-files:nil
End:
-->