Ruoqing He 5cb5115456 build: Fix spdk in linux/arm64 image
The reason `test_vfio_user` fails is, as @likebreath pointed out, that
our ARM host does not support SVE, while the nvme_tgt binary built from
the container image requires it. As a result, we hit a SIGILL when
running the nvme_tgt binary. This also explains why the failure does
not happen when the container is built on the host itself.
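
For illustration, one way to confirm this kind of mismatch (the binary
path is an assumption based on where this change installs spdk, and
the grep for SVE mnemonics is only a heuristic):

    # Does the host advertise SVE? No output means SVE is unsupported.
    grep -o -m1 '\bsve\b' /proc/cpuinfo

    # Does the binary contain SVE instructions? A non-zero count of
    # common SVE mnemonics suggests it targets an SVE-capable CPU.
    objdump -d /usr/local/bin/spdk-nvme/nvme_tgt \
        | grep -c -E '\b(ptrue|whilelo)\b'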

And to quote @rbradford:

When a job is run on one of the workers it looks to see if there is a
container locally matching the name specified in the dev_cli.sh
script - if there is then it uses it. Otherwise it will try to download
it from the container registry - if that fails then it will be built
locally. The x86-64 workers started dynamically never have a local
version as they are fresh VMs, but the ARM64 builder has a local
container image cache.

This can lead to an issue where, if the image is built with one version
(a handcrafted datestamp) and then the Dockerfile is changed without
changing the datestamp, an old version may be fetched from the cache or
the server. It is therefore essential to always bump the datestamp
(there is a number after the '-' that can be used for this).

However, there is also the added complexity that the image built and
uploaded to the container registry is not the same as the one built
locally and used for the initial testing of the Dockerfile change. This
leads to the issue we have seen, where the CPU compiler flags (from
-march=native) differ between the QEMU cross build in the hosted GHA
action and the local ARM64 build, resulting in a binary in the remotely
built container that does not work locally.
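
As an illustration of the -march=native pitfall (a sketch; the exact
output depends on the toolchain), the architecture the compiler
actually targets can be inspected in each build environment with:

    # -march=native expands to whatever CPU features the build
    # environment exposes, so the QEMU cross build in GHA produced
    # SVE-enabled code while the local ARM64 builder has no SVE.
    gcc -march=native -Q --help=target | grep -- '-march='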

We end up specifying TARGET_ARCHITECTURE="armv8.2-a" when building
spdk, and putting the built `python/spdk/` folder into
`/usr/local/bin/spdk-nvme`.
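
A minimal sketch of the resulting build step (assuming spdk's
configure exposes TARGET_ARCHITECTURE through its --target-arch
option; the URL, paths and remaining flags are illustrative rather
than copied from the Dockerfile):

    # Build spdk for a fixed armv8.2-a baseline instead of the
    # default -march=native, so nvme_tgt runs on hosts without SVE.
    git clone https://github.com/spdk/spdk /usr/src/spdk
    cd /usr/src/spdk && git submodule update --init
    ./configure --with-vfio-user --target-arch=armv8.2-a
    make -j"$(nproc)"
    # Also ship the python helpers next to the target binaries.
    mkdir -p /usr/local/bin/spdk-nvme
    cp -r python/spdk /usr/local/bin/spdk-nvme/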

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-28 18:34:23 +00:00