diff options
Diffstat (limited to 'doc/src/sgml/ecpg.sgml')
-rw-r--r-- | doc/src/sgml/ecpg.sgml | 836 |
1 files changed, 836 insertions, 0 deletions
diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml new file mode 100644 index 00000000000..32fd459e525 --- /dev/null +++ b/doc/src/sgml/ecpg.sgml @@ -0,0 +1,836 @@ +<Chapter> +<DocInfo> +<AuthorGroup> +<Author> +<FirstName>Linux</FirstName> +<Surname>Tolke</Surname> +</Author> +<Author> +<FirstName>Michael</FirstName> +<Surname>Meskes</Surname> +</Author> +</AuthorGroup> +<Copyright> +<Year>1996-1997</Year> +<Holder>Linus Tolke</Holder> +</Copyright> +<Copyright> +<Year>1998</Year> +<Holder>Michael Meskes</Holder> +</Copyright> +<Date>Transcribed 1998-02-12</Date> +</DocInfo> + +<Title><Application>ecpg</Application> - Embedded <Acronym>SQL</Acronym> in <Acronym>C</Acronym></Title> + +<Para> +This describes an embedded <Acronym>SQL</Acronym> in <Acronym>C</Acronym> package for <ProductName>Postgres</ProductName>. + +It is written by <ULink url="mailto:linus@epact.se">Linus Tolke</ULink> +and <ULink url="mailto:meskes@debian.org">Michael Meskes</ULink>. + +<Note> +<Para> +Permission is granted to copy and use in the same way as you are allowed +to copy and use the rest of the <ProductName>PostgreSQL</ProductName>. +</Para> +</Note> + +<Sect1> +<Title>Why Embedded <Acronym>SQL</Acronym>?</Title> + +<Para> +Embedded <Acronym>SQL</Acronym> has some small advantages over other ways to handle <Acronym>SQL</Acronym> +queries. It takes care of all the tedious moving of information to and +from variables in your <Acronym>C</Acronym> program. Many <Acronym>RDBMS</Acronym> packages +support this embedded language. + +<Para> +There is an ANSI-standard describing how the embedded language should +work. Most embedded <Acronym>SQL</Acronym> preprocessors I have seen and heard of make +extensions so it is difficult to obtain portability between them +anyway. I have not read the standard but I hope that my implementation +does not deviate too much and that it would be possible to port programs +with embedded <Acronym>SQL</Acronym> written for other <Acronym>RDBMS</Acronym> packages +to <ProductName>Postgres</ProductName> and thus +promoting the spirit of free software. + +<Sect1> +<Title>The Concept</Title> + +<Para> +You write your program in <Acronym>C</Acronym> with some special <Acronym>SQL</Acronym> things. +For declaring variables that can be used in <Acronym>SQL</Acronym> statements you need to +put them in a special declare section. +You use a special syntax for the <Acronym>SQL</Acronym> queries. + +<Para> +Before compiling you run the file through the embedded <Acronym>SQL</Acronym> <Acronym>C</Acronym> +preprocessor and it converts the <Acronym>SQL</Acronym> statements you used to function +calls with the variables used as arguments. Both variables that are used +as input to the <Acronym>SQL</Acronym> statements and variables that will contain the +result are passed. + +<Para> +Then you compile and at link time you link with a special library that +contains the functions used. These functions (actually it is mostly one +single function) fetches the information from the arguments, performs +the <Acronym>SQL</Acronym> query using the ordinary interface (<FileName>libpq</FileName>) and puts back +the result in the arguments dedicated for output. + +<Para> +Then you run your program and when the control arrives to the <Acronym>SQL</Acronym> +statement the <Acronym>SQL</Acronym> statement is performed against the database and you +can continue with the result. + + +<Sect1> +<Title>How To Use <Application>egpc</Application></Title> + +<Para> +This section describes how to use the <Application>egpc</Application> tool. + +<Sect2> +<Title>Preprocessor + +<Para> +The preprocessor is called <Application>ecpg</Application>. After installation it resides in +the <ProductName>Postgres</ProductName> <FileName>bin/</FileName> directory. + +<Sect2> +<Title>Library + +<Para> +The <Application>ecpg</Application> library is called <FileName>libecpg.a</FileName> or +<FileName>libecpg.so</FileName>. Additionally, the library +uses the <FileName>libpq</FileName> library for communication to the +<ProductName>Postgres</ProductName> server so you will +have to link your program with <Parameter>-lecpg -lpq</Parameter>. + +<Para> +The library has some methods that are "hidden" but that could prove very +useful sometime. + +<VariableList> +<VarListEntry> +<Term>ECPGdebug(int, FILE *stream)</Term> +<ListItem> +<Para> +If this is called, with the first argument non-zero, then debuglogging is turned +on. Debuglogging is done on <Function>stream</Function>. Most <Acronym>SQL</Acronym> statement logs its +arguments and result. + +<Para> +The most important one (<Function>ECPGdo</Function>) that is called on all <Acronym>SQL</Acronym> +statements except <Command>EXEC SQL COMMIT</Command>, <Command>EXEC SQL ROLLBACK</Command>, +<Command>EXEC SQL CONNECT</Command> logs both its expanded string, i.e. the string +with all the input variables inserted, and the result from the +<ProductName>Postgres</ProductName> server. This can be very useful when searching for errors +in your <Acronym>SQL</Acronym> statements. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>ECPGstatus()</Term> +<ListItem> +<Para> +This method returns TRUE if we are connected to a database and FALSE if not. +</Para> +</ListItem> +</VarListEntry> +</VariableList> + +<Sect2> +<Title>Error handling + +<Para> +To be able to detect errors from the <ProductName>Postgres</ProductName> server you include a line +like +<ProgramListing> +exec sql include sqlca; +</ProgramListing> +in the include section of your file. This will define a struct and a +variable with the name <Parameter>sqlca</Parameter> as following: +<ProgramListing> +struct sqlca { + int sqlcode; + struct { + int sqlerrml; + char sqlerrmc[1000]; + } sqlerrm; +} sqlca; +</ProgramListing> + +<Para> +If an error occured in the last <Acronym>SQL</Acronym> statement then <Parameter>sqlca.sqlcode</Parameter> +will be non-zero. If <Parameter>sqlca.sqlcode</Parameter> is less that 0 then this is +some kind of serious error, like the database definition does not match +the query given. If it is bigger than 0 then this is a normal error like +the table did not contain the requested row. + +<Para> +sqlca.sqlerrm.sqlerrmc will contain a string that describes the error. +The string ends with <Quote>line 23.</Quote> where the line is the line number +in the source file (actually the file generated by the preprocessor but +I hope I can fix this to be the line number in the input file.) + +<Para> +List of errors that can occur: + +<VariableList> +<VarListEntry> +<Term>-1, Unsupported type %s on line %d.</Term> +<ListItem> +<Para> +Does not normally occur. This is a sign that the preprocessor has +generated something that the library does not know about. Perhaps you +are running incompatible versions of the preprocessor and the library. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Too many arguments line %d.</Term> +<ListItem> +<Para> +The preprocessor has goofed up and generated some incorrect code. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Too few arguments line %d.</Term> +<ListItem> +<Para> +The preprocessor has goofed up and generated some incorrect code. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Error starting transaction line %d.</Term> +<ListItem> +<Para> +<ProductName>Postgres</ProductName> signalled to us that we cannot open the connection. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Postgres error: %s line %d.</Term> +<ListItem> +<Para> +Some <ProductName>Postgres</ProductName> error. The message contains the error message from the +<ProductName>Postgres</ProductName> backend. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>1, Data not found line %d.</Term> +<ListItem> +<Para> +This is a "normal" error that tells you that what you are quering cannot +be found or we have gone through the cursor. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, To many matches line %d.</Term> +<ListItem> +<Para> +This means that the query has returned several lines. The <Command>SELECT</Command> +you made probably was not unique. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Not correctly formatted int type: %s line %d.</Term> +<ListItem> +<Para> +This means that the host variable is of an <Type>int</Type> type and the field +in the <ProductName>Postgres</ProductName> database is of another type and contains a value that +cannot be interpreted as an <Type>int</Type>. The library uses <Function>strtol</Function> +for this conversion. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Not correctly formatted unsigned type: %s line %d.</Term> +<ListItem> +<Para> +This means that the host variable is of an <Type>unsigned int</Type> type and +the field in the <ProductName>Postgres</ProductName> database is of another type and contains a +value that cannot be interpreted as an <Type>unsigned int</Type>. The library +uses <Function>strtoul</Function> for this conversion. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Not correctly formatted floating point type: %s line %d.</Term> +<ListItem> +<Para> +This means that the host variable is of an <Type>float</Type> type and +the field in the <ProductName>Postgres</ProductName> database is of another type and contains a +value that cannot be interpreted as an <Type>float</Type>. The library +uses <Function>strtod</Function> for this conversion. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Too few arguments line %d.</Term> +<ListItem> +<Para> +This means that <ProductName>Postgres</ProductName> has returned more records than we have +matching variables. Perhaps you have forgotten a couple of the host +variables in the <Command>INTO :var1,:var2</Command>-list. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Too many arguments line %d.</Term> +<ListItem> +<Para> +This means that <ProductName>Postgres</ProductName> has returned fewer records than we have +host variables. Perhaps you have to many host variables in the +<Command>INTO :var1,:var2</Command>-list. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Empty query line %d.</Term> +<ListItem> +<Para> +<ProductName>Postgres</ProductName> returned PGRES_EMPTY_QUERY. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Error: %s line %d.</Term> +<ListItem> +<Para> +This means that <ProductName>Postgres</ProductName> returned on of the errors +PGRES_NONFATAL_ERROR, PGRES_FATAL_ERROR or PGRES_BAD_RESPONSE. Which one +and why is explained in the message. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Postgres error line %d.</Term> +<ListItem> +<Para> +<ProductName>Postgres</ProductName> returns something that the library does not know how to +handle. This is probably because the version of <ProductName>Postgres</ProductName> does not +match the version of the <Application>ecpg</Application> library. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Error committing line %d.</Term> +<ListItem> +<Para> +Error during <Command>COMMIT</Command>. <Command>EXEC SQL COMMIT</Command> is translated to an +<Command>end</Command> operation in <ProductName>Postgres</ProductName> and that is the operation that could +not be performed. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, Error rolling back line %d.</Term> +<ListItem> +<Para> +Error during <Command>ROLLBACK</Command>. <Command>EXEC SQL ROLLBACK</Command> is translated to +an <Command>abort</Command> operation in <ProductName>Postgres</ProductName> and that is the operation that +could not be performed. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>-1, ECPGconnect: could not open database %s.</Term> +<ListItem> +<Para> +The connect to the database did not work. +</Para> +</ListItem> +</VarListEntry> +</VariableList> + +</Sect2> + +<Sect1> +<Title>Limitations</Title> + +<Para> +What will never be included and why or what cannot be done with this +concept. + +<VariableList> +<VarListEntry> +<Term>oracles single tasking possibility</Term> +<ListItem> +<Para> +Oracle version 7.0 on AIX 3 uses the OS-supported locks on the shared +memory segments and allows the application designer to link an +application in a so called single tasking way. Instead of starting one +client process per application process both the database part and the +application part is run in the same process. In later versions of oracle +this is no longer supported. + +<Para> +This would require a total redesign of the <ProductName>Postgres</ProductName> access model and +that effort can not justify the performance gained. +</Para> +</ListItem> +</VarListEntry> +</VariableList> + +<Sect1> +<Title>Porting From Other <Acronym>RDBMS</Acronym> Packages</Title> + +<Para> +To be written by persons that knows the different <Acronym>RDBMS</Acronym> packages and that +actually does port something... + +<Sect1> +<Title>Installation</Title> + +<Para> +Since version 0.5 <Application>ecpg</Application> is distributed together with <ProductName>Postgres</ProductName>. So you +should get your precompiler, libraries and header files compiled and +installed on the fly. + +<Sect1> +<Title>For the Developer</Title> + +<Para> +This section is for those that wants to develop the <Application>ecpg</Application> interface. It +describes how the things work. The ambition is to make this section +contain things for those that want to have a look inside and the section +on How to use it should be enough for all normal questions. + +So, read this before looking at the internals of the <Application>ecpg</Application>. If +you are not interested in how it really works, skip this section. + +<Sect2> +<Title>ToDo List</Title> + +<Para> +This version the preprocessor has some flaws: + +<VariableList> +<VarListEntry> +<Term>Preprocessor output</Term> +<ListItem> +<Para> +The variables should be static. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Preprocessor cannot do syntax checking on your <Acronym>SQL</Acronym> statements</Term> +<ListItem> +<Para> +Whatever you write is copied more or less exactly to the <ProductName>Postgres</ProductName> and +you will not be able to locate your errors until run-time. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>no restriction to strings only</Term> +<ListItem> +<Para> +The PQ interface, and most of all the PQexec function, that is used by +the <Application>ecpg</Application> relies on that the request is built up as a string. In some +cases, like when the data contains the null character, this will be a +serious problem. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>error codes</Term> +<ListItem> +<Para> +There should be different error numbers for the different errors instead +of just -1 for them all. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>library functions</Term> +<ListItem> +<Para> +to_date et al. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>records</Term> +<ListItem> +<Para> +Possibility to define records or structures in the declare section +in a way that the record can be filled from one row in the database. + +<Para> +This is a simpler way to handle an entire row at a time. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>array operations</Term> +<ListItem> +<Para> +Oracle has array operations that enhances speed. When implementing it in +<Application>ecpg</Application> it is done for compatibility reasons only. For them to +improve speed would require a lot more insight in the <ProductName>Postgres</ProductName> internal +mechanisms than I possess. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>indicator variables</Term> +<ListItem> +<Para> +Oracle has indicator variables that tell if a value is <Type>null</Type> or if +it is empty. This largely simplifies array operations and provides for a +way to hack around some design flaws in the handling of <Type>VARCHAR2</Type> +(like that an empty string isn't distinguishable from a +<Type>null</Type> value). I am not sure if this is an Oracle extension or part +of the ANSI standard. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>typedefs</Term> +<ListItem> +<Para> +As well as complex types like records and arrays, typedefs would be +a good thing to take care of. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>conversion of scripts</Term> +<ListItem> +<Para> +To set up a database you need a few scripts with table definitions and +other configuration parameters. If you have these scripts for an old +database you would like to just apply them to get a <ProductName>Postgres</ProductName> database +that works in the same way. + +<Para> +To set up a database you need a few scripts with table definitions and +The functionality could be accomplished with some conversion scripts. +Speed will never be accomplished in this way. To do this you need a +bigger insight in the database construction and the use of the database +than could be realised in a script. +</Para> +</ListItem> +</VarListEntry> +</VariableList> + +<Sect2> +<Title>The Preprocessor</Title> + +<Para> +First four lines are written to the output. Two comments and two include +lines necessary for the interface to the library. + +<Para> +Then the preprocessor works in one pass only reading the input file and +writing to the output as it goes along. Normally it just echoes +everything to the output without looking at it further. + +<Para> +When it comes to an <Command>EXEC SQL</Command> statements it interviens and +changes them depending on what iit is. The <Command>EXEC SQL</Command> statement can +be one of these: + +<VariableList> +<VarListEntry> +<Term>Declare sections</Term> +<ListItem> +<Para> +Declare sections begins with +<ProgramListing> +exec sql begin declare section; +</ProgramListing> +and ends with +<ProgramListing> +exec sql end declare section; +</ProgramListing> +In the section only variable declarations are allowed. Every variable +declare within this section is also entered in a list of variables +indexed on their name together with the corresponding type. + +<Para> +The declaration is echoed to the file to make the variable a normal +C-variable also. + +<Para> +The special types VARCHAR and VARCHAR2 are converted into a named struct +for every variable. A declaration like: +<ProgramListing> +VARCHAR var[180]; +</ProgramListing> +is converted into +<ProgramListing> +struct varchar_var { int len; char arr[180]; } var; +</ProgramListing> +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Include statements</Term> +<ListItem> +<Para> +An include statement looks like: +<ProgramListing> +exec sql include filename; +</ProgramListing> +It is converted into +<ProgramListing> +#include <filename.h> +</ProgramListing> +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Connect statement</Term> +<ListItem> +<Para> +A connect statement looks like: +<ProgramListing> +exec sql connect '<Replaceable>database</Replaceable>'; +</ProgramListing> +That statement is converted into +<ProgramListing> +ECPGconnect("<Replaceable>database</Replaceable>"); +</ProgramListing> +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Open cursor statement</Term> +<ListItem> +<Para> +An open cursor statement looks like: +<ProgramListing> +exec sql open <Replaceable>cursor</Replaceable>; +</ProgramListing> +and is ignore and not copied from the output. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Commit statement</Term> +<ListItem> +<Para> +A commit statement looks like +<ProgramListing> +exec sql commit; +</ProgramListing> +and is translated on the output to +<ProgramListing> +ECPGcommit(__LINE__); +</ProgramListing> +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Rollback statement</Term> +<ListItem> +<Para> +A rollback statement looks like +<ProgramListing> +exec sql rollback; +</ProgramListing> +and is translated on the output to +<ProgramListing> +ECPGrollback(__LINE__); +</ProgramListing> +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Other statements</Term> +<ListItem> +<Para> +Other <Acronym>SQL</Acronym> statements are other statements that start with +<Command>exec sql</Command> and ends with <Command>;</Command>. Everything inbetween is treated +as an <Acronym>SQL</Acronym> statement and parsed for variable substitution. + +<Para> +Variable substitution occur when a symbol starts with a colon +(<Command>:</Command>). Then a variable with that name is found among the variables +that were previously declared within a declare section and depending on +whether or not the <Acronym>SQL</Acronym> statements knows it to be a variable for input or +output the pointers to the variables are written to the output to allow +for access by the function. + +<Para> +For every variable that is part of the <Acronym>SQL</Acronym> request the function gets +another five arguments. + +<SimpleList> +<Member>The type as a special symbol</Member> +<Member>A pointer to the value</Member> +<Member>The size of the variable if it is a varchar</Member> +<Member>Number of elements in the array (for array fetches)</Member> +<Member>The offset to the next element in the array (for array fetches)</Member> +</SimpleList> + +<Para> +Since the array fetches are not implemented yet the two last arguments +are not really important. They could perhaps have been left out. +</Para> +</ListItem> +</VarListEntry> +</VariableList> + +</Sect2> + +<Sect2> +<Title>A Complete Example</Title> + +<Para> +Here is a complete example describing the output of the preprocessor: +<ProgramListing> +exec sql begin declare section; +int index; +int result; +exec sql end declare section; +... + exec sql select res into :result from mytable where index = :index; +</ProgramListing> +is translated into: +<ProgramListing> +/* These two include files are added by the preprocessor */ +#include <ecpgtype.h> +#include <ecpglib.h> +/* exec sql begin declare section */ + + int index; + int result; +/* exec sql end declare section */ + +... + ECPGdo(__LINE__, "select res from mytable where index = ;;", + ECPGt_int,&index,0,0,sizeof(int), + ECPGt_EOIT, + ECPGt_int,&result,0,0,sizeof(int), + ECPGt_EORT ); +</ProgramListing> +(the indentation in this manual is added for readability and not +something that the preprocessor can do.) + +<Sect2> +<Title>The Library</Title> + +<Para> +The most important function in the library is the <Function>ECPGdo</Function> +function. It takes a variable amount of arguments. Hopefully we wont run +into machines with limits on the amount of variables that can be +accepted by a varchar function. This could easily add up to 50 or so +arguments. + +<Para> +The arguments are: + +<VariableList> +<VarListEntry> +<Term>A line number</Term> +<ListItem> +<Para> +This is a line number for the original line used in error messages only. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>A string</Term> +<ListItem> +<Para> +This is the <Acronym>SQL</Acronym> request that is to be issued. This request is modified +by the input variables, i.e. the variables that where not known at +compile time but are to be entered in the request. Where the variables +should go the string contains <Quote>;</Quote>. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Input variables</Term> +<ListItem> +<Para> +As described in the section about the preprocessor every input variable +gets five arguments. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>ECPGt_EOIT</Term> +<ListItem> +<Para> +An enum telling that there are no more input variables. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>Output variables</Term> +<ListItem> +<Para> +As described in the section about the preprocessor every input variable +gets five arguments. These variables are filled by the function. +</Para> +</ListItem> +</VarListEntry> + +<VarListEntry> +<Term>ECPGt_EORT</Term> +<ListItem> +<Para> +An enum telling that there are no more variables. +</Para> +</ListItem> +</VarListEntry> +</VariableList> + +<Para> +All the <Acronym>SQL</Acronym> statements are performed in one transaction unless you issue +a commit transaction. This works so that the first transaction or the +first after a commit or rollback always begins a transaction. + +<Para> +To be completed: entries describing the other entries. + +</Chapter> |