author Nathan Bossart <nathan@postgresql.org> 2024-11-27 16:19:05 -0600
committer Nathan Bossart <nathan@postgresql.org> 2024-11-27 16:19:05 -0600
commit 4b03a27fafc98e2a34e4e0b5ca44895211e021cc (patch)
tree bfa4ffbb9feca3e2e5645dbac3140fab14e7a8ad /meson.build
parent 6ba9892f5cb8c2f1c2592198d938cc8f5cf52edc (diff)
Use __attribute__((target(...))) for SSE4.2 CRC-32C support.
Presently, we check for compiler support for the required intrinsics both with and without the -msse4.2 compiler flag, and then depending on the results of those checks, we pick which files to compile with which flags. This is tedious and complicated, and it results in unsustainable coding patterns such as separate files for each portion of code that may need to be built with different compiler flags.

This commit makes use of the newly-added support for __attribute__((target(...))) in the SSE4.2 CRC-32C code. This simplifies both the configure-time checks and the build scripts, and it allows us to place the functions that use the intrinsics in files that we otherwise do not want to build with special CPU instructions (although this commit refrains from doing so). This is also preparatory work for a proposed follow-up commit that will further optimize the CRC-32C code with AVX-512 instructions.

While at it, this commit modifies meson's checks for SSE4.2 CRC support to be the same as autoconf's. meson was choosing whether to use a runtime check based purely on whether -msse4.2 is required, while autoconf has long checked for the __SSE4_2__ preprocessor symbol to decide. meson's previous approach seems to work just fine, but this change avoids needing to build multiple test programs and to keep track of whether to actually use pg_attribute_target().

Ideally we'd use __attribute__((target(...))) for ARMv8 CRC support, too, but there's little point in doing so because until clang 16, using the ARM intrinsics still requires special compiler flags. Perhaps we can re-evaluate this decision after some time has passed.

Author: Raghuveer Devulapalli
Discussion: https://postgr.es/m/PH8PR11MB8286BE735A463468415D46B5FB5C2%40PH8PR11MB8286.namprd11.prod.outlook.com
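To illustrate the pattern the message describes, here is a minimal sketch, not the code from this commit. It assumes GCC or clang on any target; the SSE4.2 path is compiled only on x86. The names crc32c, crc32c_sw, crc32c_sse42 and HAVE_X86_SSE42_SKETCH are hypothetical and do not appear in PostgreSQL, which wraps the attribute in pg_attribute_target() and performs its own CPUID-based runtime check. The sketch substitutes the compiler builtin __builtin_cpu_supports() for that check to stay short.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <nmmintrin.h>
#define HAVE_X86_SSE42_SKETCH 1	/* hypothetical macro, for this sketch only */
#endif

/* Portable bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c_sw(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

#ifdef HAVE_X86_SSE42_SKETCH
/*
 * Only this function is compiled with SSE 4.2 enabled; the rest of the file
 * is built with the default flags, so no -msse4.2 is needed on the command
 * line.
 */
__attribute__((target("sse4.2")))
static uint32_t
crc32c_sse42(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = _mm_crc32_u8(crc, *p++);
	return crc;
}
#endif

/* Compute CRC-32C over a buffer, choosing the implementation at runtime. */
uint32_t
crc32c(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;

#ifdef HAVE_X86_SSE42_SKETCH
	/* Assumption: compiler builtin used in place of a hand-written CPUID check. */
	if (__builtin_cpu_supports("sse4.2"))
		crc = crc32c_sse42(crc, p, len);
	else
#endif
		crc = crc32c_sw(crc, p, len);

	return crc ^ 0xFFFFFFFFu;
}
```

Calling crc32c("123456789", 9) yields the standard CRC-32C check value 0xE3069283 on either path. Because the attribute is attached to a single function, the intrinsic-using code can sit next to the portable fallback in the same file, which is the simplification the message refers to.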
Diffstat (limited to 'meson.build')
-rw-r--r-- meson.build 22
1 files changed, 15 insertions, 7 deletions
diff --git a/meson.build b/meson.build
index ff3848b1d85..6bc3bb51dc0 100644
--- a/meson.build
+++ b/meson.build
@@ -2211,14 +2211,19 @@ endif
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, perhaps with some extra CFLAGS, compile both
-# implementations and select which one to use at runtime, depending on whether
-# SSE 4.2 is supported by the processor we're running on.
+# the SSE intrinsics, compile both implementations and select which one to use
+# at runtime, depending on whether SSE 4.2 is supported by the processor we're
+# running on.
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
# we're not targeting such a processor, but can nevertheless produce code that
# uses the CRC instructions, compile both, and select at runtime.
+#
+# Note that we do not use __attribute__((target("..."))) for the ARM CRC
+# instructions because until clang 16, using the ARM intrinsics still requires
+# special -march flags. Perhaps we can re-evaluate this decision after some
+# time has passed.
###############################################################
have_optimized_crc = false
@@ -2234,6 +2239,9 @@ if host_cpu == 'x86' or host_cpu == 'x86_64'
prog = '''
#include <nmmintrin.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("sse4.2")))
+#endif
int main(void)
{
unsigned int crc = 0;
@@ -2244,16 +2252,16 @@ int main(void)
}
'''
- if cc.links(prog, name: '_mm_crc32_u8 and _mm_crc32_u32 without -msse4.2',
+ if not cc.links(prog, name: '_mm_crc32_u8 and _mm_crc32_u32',
args: test_c_args)
+ # Do not use Intel SSE 4.2
+ elif (cc.get_define('__SSE4_2__') != '')
# Use Intel SSE 4.2 unconditionally.
cdata.set('USE_SSE42_CRC32C', 1)
have_optimized_crc = true
- elif cc.links(prog, name: '_mm_crc32_u8 and _mm_crc32_u32 with -msse4.2',
- args: test_c_args + ['-msse4.2'])
+ else
# Use Intel SSE 4.2, with runtime check. The CPUID instruction is needed for
# the runtime check.
- cflags_crc += '-msse4.2'
cdata.set('USE_SSE42_CRC32C', false)
cdata.set('USE_SSE42_CRC32C_WITH_RUNTIME_CHECK', 1)
have_optimized_crc = true