tcp: Remove compile-time dependency on struct tcp_info version

In the Makefile we probe to create several defines based on the presence
of particular fields in struct tcp_info. These defines are used for two
purposes, neither of which they accomplish well:

1) Determining if the tcp_info fields are available at runtime. For this
   purpose the defines are Just Plain Wrong, since the runtime kernel may
   not be the same as the compile time kernel. We corrected this for
   tcpi_snd_wnd, but not for tcpi_bytes_acked or tcpi_min_rtt.

2) Allowing the source to compile against older kernel headers which don't
   have the fields in question. This works in theory, but it does mean
   we won't be able to use the fields, even if later run against a
   newer kernel. Furthermore, it's quite fragile: without much more
   thorough tests of builds in different environments than we're currently
   set up for, it's very easy to miss cases where we're accessing a field
   without protection from an #ifdef. For example we currently access
   tcpi_snd_wnd without #ifdefs in tcp_update_seqack_wnd().

Improve this with a different approach, borrowed from qemu (which has many
instances of similar problems). Don't compile against linux/tcp.h, using
netinet/tcp.h instead. Then, where we need an extension field, define
a struct tcp_info_linux, copied from the kernel, with all the fields we're
interested in. That may need updating from future kernel versions, but
only when we want to use a new extension, so it shouldn't be frequent.

This allows us to remove the HAS_SND_WND define entirely. We keep
HAS_BYTES_ACKED and HAS_MIN_RTT for now, since they're used for purpose (1);
we'll fix that in a later patch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Trivial grammar fixes in comments]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-24 04:59:20 +00:00
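
The runtime check that the commit message calls for can be sketched as
follows. This is a minimal illustration, not code from this tree:
struct tcp_info_head, TCP_INFO_END(), TCP_INFO_HAS() and tcp_info_get() are
hypothetical names, and the structure holds only the leading fields of the
kernel layout, with the two bitfield bytes collapsed into plain bytes. A
truncated structure is enough for a sketch because the kernel copies out at
most as many bytes as the buffer length it is given. The idea is to compare
the length returned by getsockopt() with the end offset of the wanted field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical, truncated copy of the kernel layout: leading fields only */
struct tcp_info_head {
	uint8_t		tcpi_state;
	uint8_t		tcpi_ca_state;
	uint8_t		tcpi_retransmits;
	uint8_t		tcpi_probes;
	uint8_t		tcpi_backoff;
	uint8_t		tcpi_options;
	uint8_t		tcpi_wscale_bits;	/* two 4-bit fields in the kernel */
	uint8_t		tcpi_flag_bits;		/* bitfields in the kernel */

	uint32_t	tcpi_rto;
	uint32_t	tcpi_ato;
};

/* Offset of the first byte after @field in struct tcp_info_head */
#define TCP_INFO_END(field)						\
	(offsetof(struct tcp_info_head, field) +			\
	 sizeof(((struct tcp_info_head *)0)->field))

/* True if a reply of @len bytes fully covers @field */
#define TCP_INFO_HAS(len, field)	((size_t)(len) >= TCP_INFO_END(field))

/* Fetch TCP_INFO for socket @s into @ti
 * Return: number of bytes filled in by the kernel, -1 on failure
 */
static int tcp_info_get(int s, struct tcp_info_head *ti)
{
	socklen_t sl = sizeof(*ti);

	assert(ti);
	memset(ti, 0, sizeof(*ti));
	if (getsockopt(s, IPPROTO_TCP, TCP_INFO, ti, &sl) < 0)
		return -1;

	return (int)sl;
}
```

A caller would store the result of tcp_info_get(s, &ti) in len, and read
ti.tcpi_rto only if TCP_INFO_HAS(len, tcpi_rto) is true, taking the same
fallback path it would use on a kernel that lacks the field otherwise. No
#ifdef is needed at the point of access.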

/* SPDX-License-Identifier: GPL-2.0-or-later
 * Copyright Red Hat
 *
 * Declarations for Linux specific dependencies
 */

#ifndef LINUX_DEP_H
#define LINUX_DEP_H

/* struct tcp_info_linux - Information from Linux TCP_INFO getsockopt()
 *
 * Largely derived from include/linux/tcp.h in the Linux kernel
 *
 * Some fields returned by TCP_INFO have been there for ages and are shared with
 * BSD. struct tcp_info from netinet/tcp.h has only those fields. There are
 * also many Linux-specific extensions to the structure, which are only found
 * in the linux/tcp.h version of struct tcp_info.
 *
 * We want to use some of those extension fields, when available. We can test
 * for availability in the runtime kernel using the length returned from
 * getsockopt(). However, we won't necessarily be compiled against the same
 * kernel headers as we'll run with, so compiling directly against linux/tcp.h
 * means wrapping every field access in an #ifdef whose #else does the same
 * thing as when the field is missing at runtime. This rapidly gets messy.
 *
 * Instead we define here struct tcp_info_linux which includes all the Linux
 * extensions that we want to use. This is taken from v6.11 of the kernel.
 */
struct tcp_info_linux {
	uint8_t		tcpi_state;
	uint8_t		tcpi_ca_state;
	uint8_t		tcpi_retransmits;
	uint8_t		tcpi_probes;
	uint8_t		tcpi_backoff;
	uint8_t		tcpi_options;
	uint8_t		tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;
	uint8_t		tcpi_delivery_rate_app_limited:1, tcpi_fastopen_client_fail:2;

	uint32_t	tcpi_rto;
	uint32_t	tcpi_ato;
	uint32_t	tcpi_snd_mss;
	uint32_t	tcpi_rcv_mss;

	uint32_t	tcpi_unacked;
	uint32_t	tcpi_sacked;
	uint32_t	tcpi_lost;
	uint32_t	tcpi_retrans;
	uint32_t	tcpi_fackets;

	/* Times. */
	uint32_t	tcpi_last_data_sent;
	uint32_t	tcpi_last_ack_sent;
	uint32_t	tcpi_last_data_recv;
	uint32_t	tcpi_last_ack_recv;

	/* Metrics. */
	uint32_t	tcpi_pmtu;
	uint32_t	tcpi_rcv_ssthresh;
	uint32_t	tcpi_rtt;
	uint32_t	tcpi_rttvar;
	uint32_t	tcpi_snd_ssthresh;
	uint32_t	tcpi_snd_cwnd;
	uint32_t	tcpi_advmss;
	uint32_t	tcpi_reordering;

	uint32_t	tcpi_rcv_rtt;
	uint32_t	tcpi_rcv_space;

	uint32_t	tcpi_total_retrans;

	/* Linux extensions */
	uint64_t	tcpi_pacing_rate;
	uint64_t	tcpi_max_pacing_rate;
	uint64_t	tcpi_bytes_acked;	/* RFC4898 tcpEStatsAppHCThruOctetsAcked */
	uint64_t	tcpi_bytes_received;	/* RFC4898 tcpEStatsAppHCThruOctetsReceived */
	uint32_t	tcpi_segs_out;		/* RFC4898 tcpEStatsPerfSegsOut */
	uint32_t	tcpi_segs_in;		/* RFC4898 tcpEStatsPerfSegsIn */

	uint32_t	tcpi_notsent_bytes;
	uint32_t	tcpi_min_rtt;
	uint32_t	tcpi_data_segs_in;	/* RFC4898 tcpEStatsDataSegsIn */
	uint32_t	tcpi_data_segs_out;	/* RFC4898 tcpEStatsDataSegsOut */

	uint64_t	tcpi_delivery_rate;

	uint64_t	tcpi_busy_time;		/* Time (usec) busy sending data */
	uint64_t	tcpi_rwnd_limited;	/* Time (usec) limited by receive window */
	uint64_t	tcpi_sndbuf_limited;	/* Time (usec) limited by send buffer */

	uint32_t	tcpi_delivered;
	uint32_t	tcpi_delivered_ce;

	uint64_t	tcpi_bytes_sent;	/* RFC4898 tcpEStatsPerfHCDataOctetsOut */
	uint64_t	tcpi_bytes_retrans;	/* RFC4898 tcpEStatsPerfOctetsRetrans */
	uint32_t	tcpi_dsack_dups;	/* RFC4898 tcpEStatsStackDSACKDups */
	uint32_t	tcpi_reord_seen;	/* reordering events seen */

	uint32_t	tcpi_rcv_ooopack;	/* Out-of-order packets received */

	uint32_t	tcpi_snd_wnd;		/* peer's advertised receive window after
						 * scaling (bytes)
						 */
	uint32_t	tcpi_rcv_wnd;		/* local advertised receive window after
						 * scaling (bytes)
						 */

	uint32_t	tcpi_rehash;		/* PLB or timeout triggered rehash attempts */

	uint16_t	tcpi_total_rto;		/* Total number of RTO timeouts, including
						 * SYN/SYN-ACK and recurring timeouts.
						 */
	uint16_t	tcpi_total_rto_recoveries;	/* Total number of RTO
							 * recoveries, including any
							 * unfinished recovery.
							 */
	uint32_t	tcpi_total_rto_time;	/* Total time spent in RTO recoveries
						 * in milliseconds, including any
						 * unfinished recovery.
						 */
};

#include <linux/falloc.h>

#ifndef FALLOC_FL_COLLAPSE_RANGE
#define FALLOC_FL_COLLAPSE_RANGE	0x08
#endif

#include <linux/close_range.h>

/* glibc < 2.34 and musl as of 1.2.5 need these */
#ifndef SYS_close_range
#define SYS_close_range		436
#endif
#ifndef CLOSE_RANGE_UNSHARE	/* Linux kernel < 5.9 */
#define CLOSE_RANGE_UNSHARE	(1U << 1)
#endif

__attribute__ ((weak))
/* cppcheck-suppress funcArgNamesDifferent */
int close_range(unsigned int first, unsigned int last, int flags) {
	return syscall(SYS_close_range, first, last, flags);
}

#endif /* LINUX_DEP_H */