path: root/src/backend/jit/llvm/llvmjit_expr.c
Commit message | Author | Age
...
* Extend ExecBuildAggTrans() to support a NULL pointer check. | Jeff Davis | 2020-03-04
  Optionally push a step to check for a NULL pointer to the pergroup state.

  This will be important for disk-based hash aggregation in combination with grouping sets. When memory limits are reached, a given tuple may find its per-group state for some grouping sets but not others. For the former, it advances the per-group state as normal; for the latter, it skips evaluation and the calling code will have to spill the tuple and reprocess it in a later batch.

  Add the NULL check as a separate expression step because in some common cases it's not needed.

  Discussion: https://postgr.es/m/20200221202212.ssb2qpmdgrnx52sj%40alap3.anarazel.de
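The control flow described in this message can be illustrated with a small C sketch. It is not the ExecBuildAggTrans() code; the type `PerGroupSketch` and the function `advance_all_sets` are hypothetical names invented for the example, and the "transition" is just a running sum.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-group transition state. */
typedef struct PerGroupSketch
{
    long sum;
} PerGroupSketch;

/*
 * Advance the state of every grouping set whose per-group pointer is
 * non-NULL, and skip the others. Returns the number of skipped grouping
 * sets; a caller would spill the tuple when this is nonzero.
 */
static int
advance_all_sets(PerGroupSketch **pergroups, int nsets, long value)
{
    int skipped = 0;

    for (int setno = 0; setno < nsets; setno++)
    {
        if (pergroups[setno] == NULL)
        {
            /* The NULL check: no state in memory for this grouping set. */
            skipped++;
            continue;
        }
        pergroups[setno]->sum += value;
    }
    return skipped;
}
```

In this sketch the check is always performed; the commit instead emits it as a separate step so that plans which can never see a NULL pointer do not pay for it.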
* expression eval: Reduce number of steps for agg transition invocations. | Andres Freund | 2020-02-24
  Do so by combining the various steps that are part of aggregate transition function invocation into one larger step. As some of the current steps are only necessary for some aggregates, have one variant of the aggregate transition step for each possible combination.

  To avoid further manual copies of code in the different transition step implementations, move most of the code into helper functions marked as "always inline".

  The benefit of this change is an increase in performance when aggregating lots of rows. This comes in part due to the reduced number of indirect jumps due to the reduced number of steps, and in part by reducing redundant setup code across steps. This mainly benefits interpreted execution, but the code generated by JIT is also improved a bit.

  As a nice side-effect it also ends up making the code a bit simpler.

  A small additional optimization is removing the need to set aggstate->curaggcontext before calling ExecAggInitGroup, choosing to instead pass curaggcontext as an argument. It was, in contrast to other aggregate related functions, only needed to fetch a memory context to copy the transition value into.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
              https://postgr.es/m/5c371df7cee903e8cd4c685f90c6c72086d3a2dc.camel@j-davis.com
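The "one variant per combination, shared always-inline helper" idea can be sketched in C as follows. This is an illustration only: `TransSketch`, `trans_common` and the three variants are hypothetical names, plain `static inline` stands in for a compiler-specific always-inline attribute, and addition stands in for a real transition function.

```c
#include <stdbool.h>

/* Hypothetical transition state: a running value plus an "unset" marker. */
typedef struct TransSketch
{
    long value;     /* running transition value */
    bool no_value;  /* true until the first input initializes the state */
} TransSketch;

/*
 * Shared helper. Each variant below passes constant flags, so after
 * inlining the compiler can drop the branches a variant does not need.
 */
static inline void
trans_common(TransSketch *st, long arg, bool argnull, bool strict, bool init)
{
    if (strict && argnull)
        return;                 /* strict variants skip NULL inputs */
    if (init && st->no_value)
    {
        st->value = arg;        /* first input becomes the initial state */
        st->no_value = false;
        return;
    }
    st->value += arg;           /* stand-in for calling the transition function */
}

/* One thin wrapper per combination of checks. */
static void
trans_plain(TransSketch *st, long arg, bool argnull)
{
    trans_common(st, arg, argnull, false, false);
}

static void
trans_strict(TransSketch *st, long arg, bool argnull)
{
    trans_common(st, arg, argnull, true, false);
}

static void
trans_strict_init(TransSketch *st, long arg, bool argnull)
{
    trans_common(st, arg, argnull, true, true);
}
```

An interpreter would dispatch once to the matching wrapper instead of running a separate step for each check, which is where the reduction in indirect jumps comes from.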
* jit: Reference expression step functions via llvmjit_types. | Andres Freund | 2020-02-06
  The main benefit of doing so is that this allows llvm to ensure that types match - previously that'd only be detected by a crash within the called function. There were a number of cases where we passed a superfluous parameter...

  To avoid needing to add all the functions to llvmjit.{c,h}, instead get them from the llvm module for llvmjit_types.c. Also use that for the functions from llvmjit_types already in llvmjit.h.

  Author: Soumyadeep Chakraborty and Andres Freund
  Discussion: https://postgr.es/m/CADwEdooww3wZv-sXSfatzFRwMuwa186LyTwkBfwEW6NjtooBPA@mail.gmail.com
* jit: Remove redundancies in expression evaluation code generation. | Andres Freund | 2020-02-06
  This merges the code emission for a number of opcodes by handling the behavioural difference more locally. This reduces code, and also improves the generated code a bit in some cases, by removing redundant constants.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
* jit: Reference functions by name in IOCOERCE steps. | Andres Freund | 2020-02-06
  Previously we used constant function pointer addresses, which prevents inlining and other related optimizations.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
* expression eval: Don't redundantly keep track of AggState. | Andres Freund | 2020-02-06
  It's already tracked via ExprState->parent, so we don't need to also include it in ExprEvalStep. When that code originally was written ExprState->parent didn't exist, but it since has been introduced in 6719b238e8f.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
* expression eval, jit: Minor code cleanups. | Andres Freund | 2020-02-06
  This mostly consists of using C99 style for loops, moving variables into narrower scopes, and a smattering of other minor improvements. Done separately to make it easier to review patches with actual functional changes.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
* Update copyrights for 2020 | Bruce Momjian | 2020-01-01
  Backpatch-through: update all files in master, backpatch legal files through 9.4
* Refactor attribute mappings used in logical tuple conversion | Michael Paquier | 2019-12-18
  Tuple conversion support in tupconvert.c is able to convert rowtypes between two relations, inner and outer, which are logically equivalent but have a different ordering or even dropped columns (used mainly for inheritance tree and partitions). This makes use of attribute mappings, which are simple arrays made of AttrNumber elements with a length matching the number of attributes of the outer relation. The length of the attribute mapping has been treated as completely independent of the mapping itself until now, making it easy to pass down an incorrect mapping length.

  This commit refactors the code related to attribute mappings and moves it into an independent facility called attmap.c, extracted from tupconvert.c. This merges the attribute mapping with its length, so callers no longer have to guess the length of a mapping: it is computed once, when the map is built. This will avoid mistakes like what has been fixed in dc816e58, which has used an incorrect mapping length by matching it with the number of attributes of an inner relation (a child partition) instead of an outer relation (a partitioned table).

  Author: Michael Paquier
  Reviewed-by: Amit Langote
  Discussion: https://postgr.es/m/20191121042556.GD153437@paquier.xyz
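As a minimal sketch of what "merging the attribute mapping with its length" can look like in C: the struct below keeps the array and its length together, so a lookup can bounds-check itself. All names ending in `Sketch`, and the function `map_attr`, are hypothetical and chosen for this example rather than taken from attmap.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 16-bit attribute number type. */
typedef int16_t AttrNumberSketch;

/* The mapping array travels together with its length. */
typedef struct AttrMapSketch
{
    AttrNumberSketch *attnums;  /* one entry per outer attribute, 0 = no match */
    int               maplen;   /* number of entries in attnums */
} AttrMapSketch;

/* Build a zero-filled map; assumes maplen > 0. Returns NULL on allocation failure. */
static AttrMapSketch *
make_attrmap_sketch(int maplen)
{
    AttrMapSketch *map = malloc(sizeof(AttrMapSketch));

    if (map == NULL)
        return NULL;
    map->attnums = calloc((size_t) maplen, sizeof(AttrNumberSketch));
    if (map->attnums == NULL)
    {
        free(map);
        return NULL;
    }
    map->maplen = maplen;
    return map;
}

static void
free_attrmap_sketch(AttrMapSketch *map)
{
    if (map != NULL)
    {
        free(map->attnums);
        free(map);
    }
}

/*
 * Look up the mapped attribute for a 1-based outer attribute number.
 * Returns 0 when the number is out of range or has no counterpart.
 */
static AttrNumberSketch
map_attr(const AttrMapSketch *map, int outer_attno)
{
    if (outer_attno < 1 || outer_attno > map->maplen)
        return 0;
    return map->attnums[outer_attno - 1];
}
```

Because the length is stored once at construction, a caller cannot accidentally pair the array with the attribute count of the wrong relation, which is the class of mistake the message refers to.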
* Make the order of the header file includes consistent in backend modules. | Amit Kapila | 2019-11-12
  Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order of header file inclusion consistent for backend modules.

  In passing, removed a couple of duplicate inclusions.

  Author: Vignesh C
  Reviewed-by: Kuntal Ghosh and Amit Kapila
  Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Don't generate EEOP_*_FETCHSOME operations for slots known to be virtual. | Andres Freund | 2019-09-30
  That avoids unnecessary work during both interpreted execution, and JIT compiled expression evaluation. Both benefit from fewer expression steps needing to be processed, and for interpreted execution there now is a fastpath dedicated to just fetching a value from a virtual slot. That's e.g. beneficial for hashjoins over nodes that perform projections, as the hashed columns are currently fetched individually.

  Author: Soumyadeep Chakraborty, Andres Freund
  Discussion: https://postgr.es/m/CAE-ML+9OKSN71+mHtfMD-L24oDp8dGTfaVjDU6U+j+FNAW5kRQ@mail.gmail.com
* Phase 2 pgindent run for v12. | Tom Lane | 2019-05-22
  Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is.

  Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Renaming for new subscripting mechanism | Alvaro Herrera | 2019-02-01
  Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to introduce a more generic subscripting mechanism, which allows subscripting not only arrays but also other object types such as JSONB. That functionality is introduced in a largish invasive patch, out of which this internal renaming patch was extracted.

  Author: Dmitry Dolgov
  Reviewed-by: Tom Lane, Arthur Zakirov
  Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
* Refactor planner's header files. | Tom Lane | 2019-01-29
  Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on.

  The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner.

  This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day.

  Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* Change function call information to be variable length. | Andres Freund | 2019-01-26
  Before this change FunctionCallInfoData, the struct arguments etc for V1 function calls are stored in, always had space for FUNC_MAX_ARGS (100) arguments, storing datums and their nullness in two arrays. For nearly every function call 100 arguments is far more than needed, therefore wasting memory. Arg and argnull being two separate arrays also guarantees that to access a single argument, two cachelines have to be touched.

  Change the layout so there's a single variable-length array with pairs of value / isnull. That drastically reduces memory consumption for most function calls (on x86-64 a two argument function now uses 64 bytes, previously 936 bytes), and makes it very likely that argument value and its nullness are on the same cacheline.

  Arguments are stored in a new NullableDatum struct, which, due to padding, needs more memory per argument than before. But as usually far fewer arguments are stored, and individual arguments are cheaper to access, that's still a clear win. It's likely that there's other places where conversion to NullableDatum arrays would make sense, e.g. TupleTableSlots, but that's for another commit.

  Because the function call information is now variable-length, allocations have to take the number of arguments into account. For heap allocations that can be done with SizeForFunctionCallInfoData(), for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro that helps to allocate an appropriately sized and aligned variable.

  Some places with stack-allocated function call information don't know the number of arguments at compile time, and currently variably sized stack allocations aren't allowed in postgres. Therefore allow for FUNC_MAX_ARGS space in these cases. They're not that common, so for now that seems acceptable.

  Because of the need to allocate FunctionCallInfo of the appropriate size, older extensions may need to update their code. To avoid subtle breakages, the FunctionCallInfoData struct has been renamed to FunctionCallInfoBaseData. Most code only references FunctionCallInfo, so that shouldn't cause much collateral damage.

  This change is also a prerequisite for more efficient expression JIT compilation (by allocating the function call information on the stack, allowing LLVM to optimize it away); previously the size of the call information caused problems inside LLVM's optimizer.

  Author: Andres Freund
  Reviewed-By: Tom Lane
  Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
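The layout change described in this message (one variable-length array of value/isnull pairs, sized per call) can be sketched in C with a flexible array member. The struct and function names below are hypothetical simplifications for illustration; the real FunctionCallInfoBaseData carries more header fields, so the byte counts quoted in the message do not apply to this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pointer-sized datum. */
typedef uintptr_t DatumSketch;

/* Value and nullness stored side by side, so they tend to share a cacheline. */
typedef struct NullableDatumSketch
{
    DatumSketch value;
    bool        isnull;
    /* the compiler adds padding here, the per-argument cost the message mentions */
} NullableDatumSketch;

/* Simplified call-info struct: a small header plus a variable-length tail. */
typedef struct CallInfoSketch
{
    short               nargs;
    NullableDatumSketch args[];     /* flexible array member */
} CallInfoSketch;

/* Bytes needed for a call with nargs arguments. */
static size_t
call_info_size(int nargs)
{
    return offsetof(CallInfoSketch, args)
        + sizeof(NullableDatumSketch) * (size_t) nargs;
}
```

A heap allocation would request `call_info_size(nargs)` bytes; an on-stack variant needs a suitably sized and aligned buffer, which is the job the message assigns to the LOCAL_FCINFO macro.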
* Update copyright for 2019 | Bruce Momjian | 2019-01-02
  Backpatch-through: certain files through 9.4
* Inline hot path of slot_getsomeattrs(). | Andres Freund | 2018-11-16
  This yields a minor speedup, which roughly balances the loss from the upcoming introduction of callbacks to do some operations on slots.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
* Don't generate tuple deforming functions for virtual slots. | Andres Freund | 2018-11-15
  Virtual tuple table slots never need tuple deforming. Therefore, if we know at expression compilation time, that a certain slot will always be virtual, there's no need to create a tuple deforming routine for it.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
* Compute information about EEOP_*_FETCHSOME at expression init time. | Andres Freund | 2018-11-15
  Previously this information was computed when JIT compiling an expression. But the information is useful for assertions in the non-JIT case too, therefore it makes sense to move it.

  This will, in a followup commit, allow to treat different slot types differently. E.g. for virtual slots there's no need to generate a JIT function to deform the slot.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
* Fixup for b84a6dafbf triggering assert failure in LLVM debug builds. | Andres Freund | 2018-11-07
  Author: Andres Freund
* Move EEOP_*_SYSVAR evaluation out of line. | Andres Freund | 2018-11-07
  This mainly de-duplicates code. As evaluating a system variable isn't the hottest path and the current inline implementation ends up calling out to an external function anyway, this is OK from a performance POV.

  The main motivation for de-duplicating is the upcoming slot abstraction work, after which there's not guaranteed to be a HeapTuple backing the slot.

  Author: Andres Freund, Amit Khandekar
  Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
* Prevent generating EEOP_AGG_STRICT_INPUT_CHECK operations when nargs == 0. | Andres Freund | 2018-11-03
  This only became a problem with 4c640f4f38, which didn't synchronize the value agg_strict_input_check.nargs is set to, with the guard condition for emitting the operation.

  Besides such instructions being unnecessary overhead, currently the LLVM JIT provider doesn't support them. It seems more sensible to avoid generating such instructions than supporting them. Add assertions to make it easier to debug a potential further occurrence.

  Discussion: https://postgr.es/m/2a505161-2727-2473-7c46-591ed108ac52@email.cz
  Backpatch: 11-, like 4c640f4f38.
* Move TupleTableSlots boolean member into one flag variable. | Andres Freund | 2018-10-15
  There are several reasons for this change:
  1) It reduces the total size of TupleTableSlot / reduces alignment padding, making the commonly accessed members fit into a single cacheline (but we currently do not force proper alignment, so that's not yet guaranteed to be helpful)
  2) Combining the booleans into a flag allows to combine read/writes from memory.
  3) With the upcoming slot abstraction changes, it allows to have core and extended flags, in a memory efficient way.

  Author: Ashutosh Bapat and Andres Freund
  Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de
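The general technique of folding several boolean members into one flags word looks roughly like this in C. The flag names and the `SlotSketch` type are hypothetical, picked for the example rather than copied from the TupleTableSlot definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits replacing individual bool members. */
#define SLOT_FLAG_EMPTY       ((uint16_t) (1 << 0))
#define SLOT_FLAG_SHOULDFREE  ((uint16_t) (1 << 1))
#define SLOT_FLAG_FIXED       ((uint16_t) (1 << 2))

/* Hypothetical slot with a single flags word. */
typedef struct SlotSketch
{
    uint16_t flags;
} SlotSketch;

/* True if any of the given flag bits is set. */
static inline bool
slot_has(const SlotSketch *slot, uint16_t flag)
{
    return (slot->flags & flag) != 0;
}

/* Several flags can be set with a single store. */
static inline void
slot_set(SlotSketch *slot, uint16_t flags)
{
    slot->flags |= flags;
}

static inline void
slot_clear(SlotSketch *slot, uint16_t flags)
{
    slot->flags &= (uint16_t) ~flags;
}
```

Setting or testing several properties at once becomes one read or write of the flags word, which corresponds to reason 2) in the message; spare bits leave room for the extended flags mentioned in reason 3).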
* Change TupleTableSlot->tts_nvalid to type AttrNumber. | Andres Freund | 2018-09-25
  Previously it was an int / 4 bytes. The maximum number of attributes in a tuple is restricted by the maximum value Var->varattno, which is an AttrNumber/int16. Hence use the same data type for TupleTableSlot->tts_nvalid.

  Author: Ashutosh Bapat
  Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de
* Collect JIT instrumentation from workers. | Andres Freund | 2018-09-25
  Previously, when using parallel query, EXPLAIN (ANALYZE)'s JIT compilation timings did not include the overhead from doing so on the workers. Fix that.

  We do so by simply aggregating the cost of doing JIT compilation on workers and the leader together. Arguably that's not quite accurate, because the total time spent doing so is spent in parallel - but it's hard to do much better. For additional detail, when VERBOSE is specified, the stats for workers are displayed separately.

  Author: Amit Khandekar and Andres Freund
  Discussion: https://postgr.es/m/CAJ3gD9eLrz51RK_gTkod+71iDcjpB_N8eC6vU2AW-VicsAERpQ@mail.gmail.com
  Backpatch: 11-
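The "simply aggregate leader and worker costs" approach amounts to a field-by-field sum, as in the C sketch below. The struct `JitCountersSketch`, its field names, and the use of seconds stored in `double` are assumptions made for the example, not the executor's actual instrumentation type.

```c
#include <stddef.h>

/* Hypothetical per-process JIT counters. */
typedef struct JitCountersSketch
{
    size_t created_functions;   /* number of functions emitted */
    double generation_secs;     /* time spent generating code */
    double optimization_secs;   /* time spent optimizing */
    double emission_secs;       /* time spent emitting machine code */
} JitCountersSketch;

/* Add src into dst, field by field. */
static void
jit_counters_add(JitCountersSketch *dst, const JitCountersSketch *src)
{
    dst->created_functions += src->created_functions;
    dst->generation_secs += src->generation_secs;
    dst->optimization_secs += src->optimization_secs;
    dst->emission_secs += src->emission_secs;
}

/* Total over the leader and all workers. */
static JitCountersSketch
jit_counters_total(const JitCountersSketch *leader,
                   const JitCountersSketch *workers, int nworkers)
{
    JitCountersSketch total = *leader;

    for (int i = 0; i < nworkers; i++)
        jit_counters_add(&total, &workers[i]);
    return total;
}
```

As the message notes, the summed times overstate elapsed wall-clock time because workers compile concurrently; keeping the per-worker structs around is what makes a separate per-worker display possible.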
* Reset context at the tail end of JITed EEOP_AGG_PLAIN_TRANS. | Andres Freund | 2018-07-22
  While no negative consequences are currently known, it's clearly wrong to not reset the context in one of the branches.

  Reported-By: Dmitry Dolgov
  Author: Dmitry Dolgov
  Discussion: https://postgr.es/m/CAGPqQf165-=+Drw3Voim7M5EjHT1zwPF9BQRjLFQzCzYnNZEiQ@mail.gmail.com
  Backpatch: 11-, where JIT compilation support was added
* Fix JITed EEOP_AGG_INIT_TRANS, which missed some state. | Andres Freund | 2018-07-22
  The JIT compiled implementation missed maintaining AggState->{current_set,curaggcontext}. That could lead to trouble because the transition value could be allocated in the wrong context.

  Reported-By: Rushabh Lathia
  Diagnosed-By: Dmitry Dolgov
  Author: Dmitry Dolgov, with minor changes by me
  Discussion: https://postgr.es/m/CAGPqQf165-=+Drw3Voim7M5EjHT1zwPF9BQRjLFQzCzYnNZEiQ@mail.gmail.com
  Backpatch: 11-, where JIT compilation support was added
* Further -Wimplicit-fallthrough cleanup. | Andres Freund | 2018-05-01
  Tom's earlier commit in 41c912cad159 didn't update a few cases that are only encountered with the non-standard --with-llvm config flag. Additionally there's also one case that appears to be a deficiency in gcc's (up to trunk as of a few days ago) detection of "fallthrough" comments - changing the placement slightly fixes that.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20180502003239.wfnqu7ekz7j7imm4@alap3.anarazel.de
* Post-feature-freeze pgindent run. | Tom Lane | 2018-04-26
  Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
* Fix a boatload of typos in C comments. | Tom Lane | 2018-04-01
  Justin Pryzby

  Discussion: https://postgr.es/m/20180331105640.GK28454@telsasoft.com
* Correct some typos in the new JIT code. | Andres Freund | 2018-03-26
  Author: Thomas Munro
* JIT tuple deforming in LLVM JIT provider. | Andres Freund | 2018-03-26
  Performing JIT compilation for deforming gains performance benefits over unJITed deforming from compile-time knowledge of the tuple descriptor. Fixed column widths, NOT NULLness, etc can be taken advantage of.

  Right now the JITed deforming is only used when deforming tuples as part of expression evaluation (and obviously only if the descriptor is known). It's likely to be beneficial in other cases, too.

  By default tuple deforming is JITed whenever an expression is JIT compiled. There's a separate boolean GUC controlling it, but that's expected to be primarily useful for development and benchmarking.

  Docs will follow in a later commit containing docs for the whole JIT feature.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Adapt expression JIT to stdbool.h introduction. | Andres Freund | 2018-03-22
  The LLVM JIT provider uses clang to synchronize types between normal C code and runtime generated code. Clang represents stdbool.h style booleans in return values & parameters differently from booleans stored in variables.

  Thus the expression compilation code from 2a0faed9d needs to be adapted to 9a95a77d9. Instead of hardcoding i8 as the type for booleans (which already was wrong on some edge case platforms!), use postgres' notion of a boolean as used for storage and for parameters.

  Per buildfarm animal xenodermus.

  Author: Andres Freund
* Add expression compilation support to LLVM JIT provider. | Andres Freund | 2018-03-22
  In addition to the interpretation of expressions (which back evaluation of WHERE clauses, target list projection, aggregates transition values etc) support compiling expressions to native code, using the infrastructure added in earlier commits.

  To avoid duplicating a lot of code, only support emitting code for cases that are likely to be performance critical. For expression steps that aren't deemed that, use the existing interpreter.

  The generated code isn't great - some architectural changes are required to address that. But this already yields a significant speedup for some analytics queries, particularly with WHERE clauses filtering a lot, or computing multiple aggregates.

  Author: Andres Freund
  Tested-By: Thomas Munro
  Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de

  Disable JITing for VALUES() nodes.

  VALUES() nodes are only ever executed once. This is primarily helpful for debugging, when forcing JITing even for cheap queries.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de