Blame examples/tcp_cubic-better-follow-cubic-curve-converted.patch

Packit Service ac8aad
The original patch changes the initialization of 'cubictcp' instance of
Packit Service ac8aad
struct tcp_congestion_ops ('cubictcp.cwnd_event' field). Kpatch
Packit Service ac8aad
intentionally rejects to process such changes.
Packit Service ac8aad
Packit Service ac8aad
This modification of the patch uses Kpatch load/unload hooks to set
Packit Service ac8aad
'cubictcp.cwnd_event' when the binary patch is loaded and reset it to NULL
Packit Service ac8aad
when the patch is unloaded.
Packit Service ac8aad
Packit Service ac8aad
It is still needed to check if changing that field could be problematic
Packit Service ac8aad
due to concurrency issues, etc.
Packit Service ac8aad
Packit Service ac8aad
'cwnd_event' callback is used only via tcp_ca_event() function.
Packit Service ac8aad
Packit Service ac8aad
include/net/tcp.h:
Packit Service ac8aad
Packit Service ac8aad
static inline void tcp_ca_event(struct sock *sk, const enum tcp_ca_event event)
Packit Service ac8aad
{
Packit Service ac8aad
	const struct inet_connection_sock *icsk = inet_csk(sk);
Packit Service ac8aad
Packit Service ac8aad
	if (icsk->icsk_ca_ops->cwnd_event)
Packit Service ac8aad
		icsk->icsk_ca_ops->cwnd_event(sk, event);
Packit Service ac8aad
}
Packit Service ac8aad
Packit Service ac8aad
In turn, tcp_ca_event() is called in a number of places in
Packit Service ac8aad
net/ipv4/tcp_output.c and net/ipv4/tcp_input.c.
Packit Service ac8aad
Packit Service ac8aad
One problem with this modification of the patch is that it may not be safe
Packit Service ac8aad
to unload it. If it is possible for tcp_ca_event() to run concurrently with
Packit Service ac8aad
the unloading of the patch, it may happen that 'icsk->icsk_ca_ops->cwnd_event'
Packit Service ac8aad
is the address of bictcp_cwnd_event() when tcp_ca_event() checks it but is
Packit Service ac8aad
set to NULL right after. So 'icsk->icsk_ca_ops->cwnd_event(sk, event)' would
Packit Service ac8aad
result in a kernel oops.
Packit Service ac8aad
Packit Service ac8aad
Whether such scenario is possible or not, it should be analyzed. If it is,
Packit Service ac8aad
then at least, the body of tcp_ca_event() should be made atomic w.r.t.
Packit Service ac8aad
changing 'cwnd_event' in the patch somehow. Perhaps, RCU could be suitable
Packit Service ac8aad
for that: a read-side critical section for the body of tcp_ca_event() with
Packit Service ac8aad
a single read of icsk->icsk_ca_ops->cwnd_event pointer with rcu_dereference().
Packit Service ac8aad
The pointer could be set by the patch with rcu_assign_pointer().
Packit Service ac8aad
Packit Service ac8aad
An alternative suggested by Josh Poimboeuf would be to patch the functions
Packit Service ac8aad
that call 'cwnd_event' callback (tcp_ca_event() in this case) so that they
Packit Service ac8aad
call bictcp_cwnd_event() directly when they detect the cubictcp struct [1].
Packit Service ac8aad
Packit Service ac8aad
Note that tcp_ca_event() is inlined in a number of places, so the binary
Packit Service ac8aad
patch will provide replacements for all of the corresponding functions
Packit Service ac8aad
rather than for just one. It is still needed to check if replacing these
Packit Service ac8aad
functions in runtime is safe.
Packit Service ac8aad
Packit Service ac8aad
References:
Packit Service ac8aad
[1] https://www.redhat.com/archives/kpatch/2015-September/msg00005.html
Packit Service ac8aad
Packit Service ac8aad
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
Packit Service ac8aad
index 894b7ce..9bff8a0 100644
Packit Service ac8aad
--- a/net/ipv4/tcp_cubic.c
Packit Service ac8aad
+++ b/net/ipv4/tcp_cubic.c
Packit Service ac8aad
@@ -153,6 +153,27 @@ static void bictcp_init(struct sock *sk)
Packit Service ac8aad
 		tcp_sk(sk)->snd_ssthresh = initial_ssthresh;
Packit Service ac8aad
 }
Packit Service ac8aad
Packit Service ac8aad
+static void bictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event)
Packit Service ac8aad
+{
Packit Service ac8aad
+	if (event == CA_EVENT_TX_START) {
Packit Service ac8aad
+		struct bictcp *ca = inet_csk_ca(sk);
Packit Service ac8aad
+		u32 now = tcp_time_stamp;
Packit Service ac8aad
+		s32 delta;
Packit Service ac8aad
+
Packit Service ac8aad
+		delta = now - tcp_sk(sk)->lsndtime;
Packit Service ac8aad
+
Packit Service ac8aad
+		/* We were application limited (idle) for a while.
Packit Service ac8aad
+		 * Shift epoch_start to keep cwnd growth to cubic curve.
Packit Service ac8aad
+		 */
Packit Service ac8aad
+		if (ca->epoch_start && delta > 0) {
Packit Service ac8aad
+			ca->epoch_start += delta;
Packit Service ac8aad
+			if (after(ca->epoch_start, now))
Packit Service ac8aad
+				ca->epoch_start = now;
Packit Service ac8aad
+		}
Packit Service ac8aad
+		return;
Packit Service ac8aad
+	}
Packit Service ac8aad
+}
Packit Service ac8aad
+
Packit Service ac8aad
 /* calculate the cubic root of x using a table lookup followed by one
Packit Service ac8aad
  * Newton-Raphson iteration.
Packit Service ac8aad
  * Avg err ~= 0.195%
Packit Service ac8aad
@@ -444,6 +465,20 @@ static struct tcp_congestion_ops cubictcp __read_mostly = {
Packit Service ac8aad
 	.name		= "cubic",
Packit Service ac8aad
 };
Packit Service ac8aad
Packit Service ac8aad
+void kpatch_load_cubictcp_cwnd_event(void)
Packit Service ac8aad
+{
Packit Service ac8aad
+	cubictcp.cwnd_event = bictcp_cwnd_event;
Packit Service ac8aad
+}
Packit Service ac8aad
+
Packit Service ac8aad
+void kpatch_unload_cubictcp_cwnd_event(void)
Packit Service ac8aad
+{
Packit Service ac8aad
+	cubictcp.cwnd_event = NULL;
Packit Service ac8aad
+}
Packit Service ac8aad
+
Packit Service ac8aad
+#include "kpatch-macros.h"
Packit Service ac8aad
+KPATCH_LOAD_HOOK(kpatch_load_cubictcp_cwnd_event);
Packit Service ac8aad
+KPATCH_UNLOAD_HOOK(kpatch_unload_cubictcp_cwnd_event);
Packit Service ac8aad
+
Packit Service ac8aad
 static int __init cubictcp_register(void)
Packit Service ac8aad
 {
Packit Service ac8aad
 	BUILD_BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE);