| VERDICT STATEMENT |
| ~~~~~~~~~~~~~~~~~ |
| The verdict statement alters control flow in the ruleset and issues policy decisions for packets. |
| |
| [verse] |
| {*accept* | *drop* | *queue* | *continue* | *return*} |
| {*jump* | *goto*} 'chain' |
| |
| *accept* and *drop* are absolute verdicts -- they terminate ruleset evaluation immediately. |
| |
| [horizontal] |
| *accept*:: Terminate ruleset evaluation and accept the packet. |
| The packet can still be dropped later by another hook. For instance, accept |
| in the forward hook still allows the packet to be dropped later in the postrouting hook, |
| or in another forward base chain that has a higher priority number and is evaluated |
| afterwards in the processing pipeline. |
| *drop*:: Terminate ruleset evaluation and drop the packet. |
| The drop occurs instantly, no further chains or hooks are evaluated. |
| It is not possible to accept the packet in a later chain again, as those |
| are not evaluated anymore for the packet. |
| *queue*:: Terminate ruleset evaluation and queue the packet to userspace. |
| Userspace must provide a drop or accept verdict. In case of accept, processing |
| resumes with the next base chain hook, not the rule following the queue verdict. |
| *continue*:: Continue ruleset evaluation with the next rule. This |
| is the default behaviour in case a rule issues no verdict. |
| *return*:: Return from the current chain and continue evaluation at the |
| next rule in the last chain. If issued in a base chain, it is equivalent to the |
| base chain policy. |
| *jump* 'chain':: Continue evaluation at the first rule in 'chain'. The current |
| position in the ruleset is pushed to a call stack and evaluation will continue |
| there when the new chain is entirely evaluated or a *return* verdict is issued. |
| In case an absolute verdict is issued by a rule in the chain, ruleset evaluation |
| terminates immediately and the specific action is taken. |
| *goto* 'chain':: Similar to *jump*, but the current position is not pushed to the |
| call stack. After the new chain has been evaluated, evaluation continues at the last |
| chain on the call stack instead of the one containing the goto statement. |
| |
| .Using verdict statements |
| ------------------- |
| # process packets from eth0 and the internal network in from_lan |
| # chain, drop all packets from eth0 with different source addresses. |
| |
| filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan |
| filter input iif eth0 drop |
| ------------------- |
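| |
| The difference between *jump*, *goto* and *return* can be sketched as follows. |
| This is only an illustration: the table and chain names ('demo', 'checks', |
| 'final') and the address range are assumptions, not part of any real setup. |
| |
| .Control flow with jump, goto and return (illustrative sketch) |
| -------------------------------------------------------------- |
| table inet demo { |
|     chain checks { |
|         # hypothetical trusted range: resume at the rule after "jump checks" |
|         ip saddr 10.0.0.0/8 return |
|         drop |
|     } |
|     chain final { |
|         tcp dport 22 accept |
|         # no verdict here: 'final' was entered with goto from a base chain, |
|         # so the call stack is empty and the base chain policy applies |
|     } |
|     chain input { |
|         type filter hook input priority 0; policy drop; |
|         jump checks |
|         goto final |
|         # never evaluated, because goto does not return to this chain |
|         accept |
|     } |
| } |
| -------------------------------------------------------------- |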
| |
| PAYLOAD STATEMENT |
| ~~~~~~~~~~~~~~~~~ |
| [verse] |
| 'payload_expression' *set* 'value' |
| |
| The payload statement alters packet content. It can be used for example to |
| set ip DSCP (diffserv) header field or ipv6 flow labels. |
| |
| .route some packets instead of bridging |
| --------------------------------------- |
| # redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging |
| # assumes 00:11:22:33:44:55 is local MAC address. |
| bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55 |
| ------------------------------------------- |
| |
| .Set IPv4 DSCP header field |
| --------------------------- |
| ip forward ip dscp set 42 |
| --------------------------- |
| |
| EXTENSION HEADER STATEMENT |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| [verse] |
| 'extension_header_expression' *set* 'value' |
| |
| The extension header statement alters packet content in variable-sized headers. |
| This can currently be used to alter the TCP maximum segment size of packets, |
| similar to the TCPMSS target of iptables. |
| |
| .change tcp mss |
| --------------- |
| tcp flags syn tcp option maxseg size set 1360 |
| # set a size based on route information: |
| tcp flags syn tcp option maxseg size set rt mtu |
| --------------- |
| |
| LOG STATEMENT |
| ~~~~~~~~~~~~~ |
| [verse] |
| *log* [*prefix* 'quoted_string'] [*level* 'syslog-level'] [*flags* 'log-flags'] |
| *log* *group* 'nflog_group' [*prefix* 'quoted_string'] [*queue-threshold* 'value'] [*snaplen* 'size'] |
| *log level audit* |
| |
| The log statement enables logging of matching packets. When this statement is |
| used from a rule, the Linux kernel will print some information on all matching |
| packets, such as header fields, via the kernel log (where it can be read with |
| dmesg(1) or read in the syslog). |
| |
| In the second form of invocation (if 'nflog_group' is specified), the Linux |
| kernel will pass the packet to nfnetlink_log which will multicast the packet |
| through a netlink socket to the specified multicast group. One or more userspace |
| processes may subscribe to the group to receive the packets, see |
| libnetfilter_log documentation for details. |
| |
| In the third form of invocation (if level audit is specified), the Linux |
| kernel writes a message into the audit buffer suitably formatted for reading |
| with auditd. Therefore no further formatting options (such as prefix or flags) |
| are allowed in this mode. |
| |
| This is a non-terminating statement, so the rule evaluation continues after |
| the packet is logged. |
| |
| .log statement options |
| [options="header"] |
| |================== |
| |Keyword | Description | Type |
| |prefix| |
| Log message prefix| |
| quoted string |
| |level| |
| Syslog level of logging | |
| string: emerg, alert, crit, err, warn [default], notice, info, debug, audit |
| |group| |
| NFLOG group to send messages to| |
| unsigned integer (16 bit) |
| |snaplen| |
| Length of packet payload to include in netlink message | |
| unsigned integer (32 bit) |
| |queue-threshold| |
| Number of packets to queue inside the kernel before sending them to userspace | |
| unsigned integer (32 bit) |
| |================================== |
| |
| .log-flags |
| [options="header"] |
| |================== |
| | Flag | Description |
| |tcp sequence| |
| Log TCP sequence numbers. |
| |tcp options| |
| Log options from the TCP packet header. |
| |ip options| |
| Log options from the IP/IPv6 packet header. |
| |skuid| |
| Log the userid of the process which generated the packet. |
| |ether| |
| Decode MAC addresses and protocol. |
| |all| |
| Enable all log flags listed above. |
| |============================== |
| |
| .Using log statement |
| -------------------- |
| # log the UID which generated the packet and ip options |
| ip filter output log flags skuid flags ip options |
| |
| # log the tcp sequence numbers and tcp options from the TCP packet |
| ip filter output log flags tcp sequence,options |
| |
| # enable all supported log flags |
| ip6 filter output log flags all |
| ----------------------- |
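| |
| The second and third forms of invocation might be used as sketched below. The |
| table and chain names, the group number 2, the prefix string and the numeric |
| values are arbitrary choices made for this illustration. |
| |
| .Logging via nfnetlink_log and audit (illustrative sketch) |
| ---------------------------------------------------------- |
| # pass the first 128 bytes of new ssh packets to NFLOG group 2 (assumed group) |
| ip filter input tcp dport 22 ct state new log group 2 prefix "ssh-new: " snaplen 128 queue-threshold 10 |
| |
| # write an audit record; prefix and flags are not allowed in this form |
| ip filter input tcp dport 23 log level audit |
| ---------------------------------------------------------- |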
| |
| REJECT STATEMENT |
| ~~~~~~~~~~~~~~~~ |
| [verse] |
| ____ |
| *reject* [ *with* 'REJECT_WITH' ] |
| |
| 'REJECT_WITH' := *icmp type* 'icmp_code' | |
| *icmpv6 type* 'icmpv6_code' | |
| *icmpx type* 'icmpx_code' | |
| *tcp reset* |
| ____ |
| |
| A reject statement is used to send back an error packet in response to the |
| matched packet. Otherwise it is equivalent to drop, so it is a terminating |
| statement that ends rule traversal. This statement is only valid in the input, |
| forward and output chains, and user-defined chains which are only called from |
| those chains. |
| |
| .different ICMP reject variants are meant for use in different table families |
| [options="header"] |
| |================== |
| |Variant |Family | Type |
| |icmp| |
| ip| |
| icmp_code |
| |icmpv6| |
| ip6| |
| icmpv6_code |
| |icmpx| |
| inet| |
| icmpx_code |
| |================== |
| |
| For a description of the different types and a list of supported keywords refer |
| to DATA TYPES section above. The common default reject value is |
| *port-unreachable*. + |
| |
| Note that in bridge family, reject statement is only allowed in base chains |
| which hook into input or prerouting. |
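| |
| As a rough illustration of the variants listed above, the rules below pick a |
| reject type per family. The table and chain names and the port numbers are |
| assumptions chosen for the example. |
| |
| .Using the reject statement (illustrative sketch) |
| ------------------------------------------------- |
| # ip family: answer with an ICMP error |
| ip filter input udp dport 161 reject with icmp type host-unreachable |
| |
| # ip6 family: answer with an ICMPv6 error |
| ip6 filter input udp dport 161 reject with icmpv6 type admin-prohibited |
| |
| # inet family: icmpx covers both IPv4 and IPv6 packets |
| inet filter input tcp dport 23 reject with icmpx type admin-prohibited |
| |
| # answer TCP connection attempts with a reset |
| inet filter input tcp dport 113 reject with tcp reset |
| |
| # no type given: the default reject value is used |
| inet filter input reject |
| ------------------------------------------------- |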
| |
| COUNTER STATEMENT |
| ~~~~~~~~~~~~~~~~~ |
| A counter statement sets the hit count of packets along with the number of bytes. |
| |
| [verse] |
| *counter* *packets* 'number' *bytes* 'number' |
| *counter* { *packets* 'number' | *bytes* 'number' } |
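| |
| A counter is typically placed in front of a verdict. The sketch below assumes |
| an existing 'ip filter' table with an 'input' chain; the ports are arbitrary. |
| |
| .Using the counter statement (illustrative sketch) |
| -------------------------------------------------- |
| # count packets and bytes of accepted ssh traffic |
| ip filter input tcp dport 22 counter accept |
| |
| # same, with explicit initial values |
| ip filter input tcp dport 80 counter packets 0 bytes 0 accept |
| -------------------------------------------------- |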
| |
| CONNTRACK STATEMENT |
| ~~~~~~~~~~~~~~~~~~~ |
| The conntrack statement can be used to set the conntrack mark, labels, event mask and zone. |
| |
| [verse] |
| *ct* {*mark* | *event* | *label* | *zone*} *set* 'value' |
| |
| The ct statement sets meta data associated with a connection. The zone id |
| has to be assigned before a conntrack lookup takes place, i.e. this has to be |
| done in prerouting and possibly output (if locally generated packets need to be |
| placed in a distinct zone), with a hook priority of -300. |
| |
| .Conntrack statement types |
| [options="header"] |
| |================== |
| |Keyword| Description| Value |
| |event| |
| conntrack event bits | |
| bitmask, integer (32 bit) |
| |helper| |
| name of ct helper object to assign to the connection | |
| quoted string |
| |mark| |
| Connection tracking mark | |
| mark |
| |label| |
| Connection tracking label| |
| label |
| |zone| |
| conntrack zone| |
| integer (16 bit) |
| |================== |
| |
| .save packet nfmark in conntrack |
| -------------------------------- |
| ct mark set meta mark |
| -------------------------------- |
| |
| .set zone mapped via interface |
| ------------------------------ |
| table inet raw { |
|     chain prerouting { |
|         type filter hook prerouting priority -300; |
|         ct zone set iif map { "eth1" : 1, "veth1" : 2 } |
|     } |
|     chain output { |
|         type filter hook output priority -300; |
|         ct zone set oif map { "eth1" : 1, "veth1" : 2 } |
|     } |
| } |
| ------------------------------------------------------ |
| |
| .restrict events reported by ctnetlink |
| -------------------------------------- |
| ct event set new,related,destroy |
| -------------------------------------- |
| |
| META STATEMENT |
| ~~~~~~~~~~~~~~ |
| A meta statement sets the value of a meta expression. The existing meta fields |
| are: priority, mark, pkttype, nftrace. + |
| |
| [verse] |
| *meta* {*mark* | *priority* | *pkttype* | *nftrace*} *set* 'value' |
| |
| A meta statement sets meta data associated with a packet. + |
| |
| .Meta statement types |
| [options="header"] |
| |================== |
| |Keyword| Description| Value |
| |priority | |
| TC packet priority| |
| tc_handle |
| |mark| |
| Packet mark | |
| mark |
| |pkttype | |
| packet type | |
| pkt_type |
| |nftrace | |
| ruleset packet tracing on/off. Use *monitor trace* command to watch traces| |
| 0, 1 |
| |========================== |
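| |
| A few meta statements are sketched below. The 'ip filter prerouting' chain, |
| the interface name eth1 and the mark value are assumptions for illustration. |
| |
| .Using the meta statement (illustrative sketch) |
| ----------------------------------------------- |
| # mark packets arriving on the (hypothetical) interface eth1 |
| ip filter prerouting iif eth1 meta mark set 0x1 |
| |
| # enable ruleset tracing for ssh packets; watch with "nft monitor trace" |
| ip filter prerouting tcp dport 22 meta nftrace set 1 |
| ----------------------------------------------- |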
| |
| LIMIT STATEMENT |
| ~~~~~~~~~~~~~~~ |
| [verse] |
| ____ |
| *limit rate* [*over*] 'packet_number' */* 'TIME_UNIT' [*burst* 'packet_number' *packets*] |
| *limit rate* [*over*] 'byte_number' 'BYTE_UNIT' */* 'TIME_UNIT' [*burst* 'byte_number' 'BYTE_UNIT'] |
| |
| 'TIME_UNIT' := *second* | *minute* | *hour* | *day* |
| 'BYTE_UNIT' := *bytes* | *kbytes* | *mbytes* |
| ____ |
| |
| A limit statement matches at a limited rate using a token bucket filter. A rule |
| using this statement will match until this limit is reached. It can be used in |
| combination with the log statement to give limited logging. The optional |
| *over* keyword makes it match over the specified rate. |
| |
| .limit statement values |
| [options="header"] |
| |================== |
| |Value | Description | Type |
| |packet_number | |
| Number of packets | |
| unsigned integer (32 bit) |
| |byte_number | |
| Number of bytes | |
| unsigned integer (32 bit) |
| |======================== |
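| |
| The rules below sketch the packet-based and byte-based forms, including the |
| combination with log mentioned above. Table and chain names, rates and the |
| log prefix are arbitrary values chosen for the example. |
| |
| .Using the limit statement (illustrative sketch) |
| ------------------------------------------------ |
| # log at most 10 ssh packets per minute, allowing a burst of 5 packets |
| ip filter input tcp dport 22 limit rate 10/minute burst 5 packets log prefix "ssh: " |
| |
| # drop echo requests exceeding 100 packets per second |
| ip filter input icmp type echo-request limit rate over 100/second drop |
| |
| # drop traffic exceeding 10 megabytes per second |
| ip filter input limit rate over 10 mbytes/second drop |
| ------------------------------------------------ |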
| |
| NAT STATEMENTS |
| ~~~~~~~~~~~~~~ |
| [verse] |
| ____ |
| *snat to* 'address' [*:*'port'] ['PRF_FLAGS'] |
| *snat to* 'address' *-* 'address' [*:*'port' *-* 'port'] ['PRF_FLAGS'] |
| *snat* { *ip* | *ip6* } *to* 'address' *-* 'address' [*:*'port' *-* 'port'] ['PR_FLAGS'] |
| *dnat to* 'address' [*:*'port'] ['PRF_FLAGS'] |
| *dnat to* 'address' [*:*'port' *-* 'port'] ['PR_FLAGS'] |
| *dnat* { *ip* | *ip6* } *to* 'address' [*:*'port' *-* 'port'] ['PR_FLAGS'] |
| *masquerade to* [*:*'port'] ['PRF_FLAGS'] |
| *masquerade to* [*:*'port' *-* 'port'] ['PRF_FLAGS'] |
| *redirect to* [*:*'port'] ['PRF_FLAGS'] |
| *redirect to* [*:*'port' *-* 'port'] ['PRF_FLAGS'] |
| |
| 'PRF_FLAGS' := 'PRF_FLAG' [*,* 'PRF_FLAGS'] |
| 'PR_FLAGS' := 'PR_FLAG' [*,* 'PR_FLAGS'] |
| 'PRF_FLAG' := 'PR_FLAG' | *fully-random* |
| 'PR_FLAG' := *persistent* | *random* |
| ____ |
| |
| The nat statements are only valid from nat chain types. + |
| |
| The *snat* and *masquerade* statements specify that the source address of the |
| packet should be modified. While *snat* is only valid in the postrouting and |
| input chains, *masquerade* makes sense only in postrouting. The dnat and |
| redirect statements are only valid in the prerouting and output chains, they |
| specify that the destination address of the packet should be modified. You can |
| use non-base chains which are called from base chains of nat chain type too. |
| All future packets in this connection will also be mangled, and rules should |
| cease being examined. |
| |
| The *masquerade* statement is a special form of snat which always uses the |
| outgoing interface's IP address to translate to. It is particularly useful on |
| gateways with dynamic (public) IP addresses. |
| |
| The *redirect* statement is a special form of dnat which always translates the |
| destination address to the local host's one. It comes in handy if one only wants |
| to alter the destination port of incoming traffic on different interfaces. |
| |
| When used in the inet family (available with kernel 5.2), the dnat and snat |
| statements require the use of the ip or ip6 keyword in case an address is |
| provided, see the examples below. |
| |
| Before kernel 4.18 nat statements require both prerouting and postrouting base chains |
| to be present since otherwise packets on the return path won't be seen by |
| netfilter and therefore no reverse translation will take place. |
| |
| .NAT statement values |
| [options="header"] |
| |================== |
| |Expression| Description| Type |
| |address| |
| Specifies that the source/destination address of the packet should be modified. |
| You may specify a mapping to relate a list of tuples composed of arbitrary |
| expression key with address value. | |
| ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. meta mark map { 10 : 192.168.1.2, 20 : 192.168.1.3 } |
| |port| |
| Specifies that the source/destination port of the packet should be modified. | |
| port number (16 bit) |
| |=============================== |
| |
| .NAT statement flags |
| [options="header"] |
| |================== |
| |Flag| Description |
| |persistent | |
| Gives a client the same source-/destination-address for each connection. |
| |random| |
| In kernel 5.0 and newer this is the same as fully-random. |
| In earlier kernels the port mapping will be randomized using a seeded MD5 |
| hash mix using source and destination address and destination port. |
| |
| |fully-random| |
| If used then port mapping is generated based on a 32-bit pseudo-random algorithm. |
| |============================= |
| |
| .Using NAT statements |
| --------------------- |
| # create a suitable table/chain setup for all further examples |
| add table nat |
| add chain nat prerouting { type nat hook prerouting priority 0; } |
| add chain nat postrouting { type nat hook postrouting priority 100; } |
| |
| # translate source addresses of all packets leaving via eth0 to address 1.2.3.4 |
| add rule nat postrouting oif eth0 snat to 1.2.3.4 |
| |
| # redirect all traffic entering via eth0 to destination address 192.168.1.120 |
| add rule nat prerouting iif eth0 dnat to 192.168.1.120 |
| |
| # translate source addresses of all packets leaving via eth0 to whatever |
| # locally generated packets would use as source to reach the same destination |
| add rule nat postrouting oif eth0 masquerade |
| |
| # redirect incoming TCP traffic for port 22 to port 2222 |
| add rule nat prerouting tcp dport 22 redirect to :2222 |
| |
| # inet family: |
| # handle ip dnat: |
| add rule inet nat prerouting dnat ip to 10.0.2.99 |
| # handle ip6 dnat: |
| add rule inet nat prerouting dnat ip6 to fe80::dead |
| # this masquerades both ipv4 and ipv6: |
| add rule inet nat postrouting meta oif ppp0 masquerade |
| |
| ------------------------ |
| |
| TPROXY STATEMENT |
| ~~~~~~~~~~~~~~~~ |
| Tproxy redirects the packet to a local socket without changing the packet header |
| in any way. If any of the arguments is missing, the data of the incoming packet |
| is used as parameter. The tproxy statement requires a match that ensures the |
| transport protocol header is present. |
| |
| [verse] |
| *tproxy to* 'address'*:*'port' |
| *tproxy to* {'address' | *:*'port'} |
| |
| This syntax can be used in *ip/ip6* tables where network layer protocol is |
| obvious. Either IP address or port can be specified, but at least one of them is |
| necessary. |
| |
| [verse] |
| *tproxy* {*ip* | *ip6*} *to* 'address'[*:*'port'] |
| *tproxy to :*'port' |
| |
| This syntax can be used in *inet* tables. The *ip/ip6* parameter defines the |
| family the rule will match. The *address* parameter must be of this family. |
| When only *port* is defined, the address family should not be specified. In |
| this case the rule will match for both families. |
| |
| .tproxy attributes |
| [options="header"] |
| |================= |
| | Name | Description |
| | address | IP address the listening socket with IP_TRANSPARENT option is bound to. |
| | port | Port the listening socket with IP_TRANSPARENT option is bound to. |
| |================= |
| |
| .Example ruleset for tproxy statement |
| ------------------------------------- |
| table ip x { |
|     chain y { |
|         type filter hook prerouting priority -150; policy accept; |
|         tcp dport ntp tproxy to 1.1.1.1 |
|         udp dport ssh tproxy to :2222 |
|     } |
| } |
| table ip6 x { |
|     chain y { |
|         type filter hook prerouting priority -150; policy accept; |
|         tcp dport ntp tproxy to [dead::beef] |
|         udp dport ssh tproxy to :2222 |
|     } |
| } |
| table inet x { |
|     chain y { |
|         type filter hook prerouting priority -150; policy accept; |
|         tcp dport 321 tproxy to :ssh |
|         tcp dport 99 tproxy ip to 1.1.1.1:999 |
|         udp dport 155 tproxy ip6 to [dead::beef]:smux |
|     } |
| } |
| ------------------------------------- |
| |
| SYNPROXY STATEMENT |
| ~~~~~~~~~~~~~~~~~~ |
| This statement processes the TCP three-way handshake in parallel in netfilter |
| context to protect either the local or a backend system. This statement requires |
| connection tracking because sequence numbers need to be translated. |
| |
| [verse] |
| *synproxy* [*mss* 'mss_value'] [*wscale* 'wscale_value'] ['SYNPROXY_FLAGS'] |
| |
| .synproxy statement attributes |
| [options="header"] |
| |================= |
| | Name | Description |
| | mss | Maximum segment size announced to clients. This must match the backend. |
| | wscale | Window scale announced to clients. This must match the backend. |
| |================= |
| |
| .synproxy statement flags |
| [options="header"] |
| |================= |
| | Flag | Description |
| | sack-perm | |
| Pass client selective acknowledgement option to backend (will be disabled if |
| not present). |
| | timestamp | |
| Pass client timestamp option to backend (will be disabled if not present, also |
| needed for selective acknowledgement and window scaling). |
| |================= |
| |
| .Example ruleset for synproxy statement |
| --------------------------------------- |
| Determine tcp options used by backend, from an external system |
| |
|     tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)' port 80 & |
|     telnet 192.0.2.42 80 |
|     18:57:24.693307 IP 192.0.2.42.80 > 192.0.2.43.48757: |
|         Flags [S.], seq 360414582, ack 788841994, win 14480, |
|         options [mss 1460,sackOK, |
|         TS val 1409056151 ecr 9690221, |
|         nop,wscale 9], |
|         length 0 |
| |
| Switch tcp_loose mode off, so conntrack will mark out-of-flow packets as state INVALID. |
| |
|     echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose |
| |
| Make SYN packets untracked. |
| |
|     table ip x { |
|         chain y { |
|             type filter hook prerouting priority raw; policy accept; |
|             tcp flags syn notrack |
|         } |
|     } |
| |
| Catch UNTRACKED (SYN packets) and INVALID (3WHS ACK packets) states and send |
| them to SYNPROXY. This rule will respond to SYN packets with SYN+ACK |
| syncookies, create ESTABLISHED for valid client response (3WHS ACK packets) and |
| drop incorrect cookies. Flags combinations not expected during 3WHS will not |
| match and continue (e.g. SYN+FIN, SYN+ACK). Finally, drop invalid packets, this |
| will be out-of-flow packets that were not matched by SYNPROXY. |
| |
|     table ip foo { |
|         chain z { |
|             type filter hook input priority filter; policy accept; |
|             ct state { invalid, untracked } synproxy mss 1460 wscale 9 timestamp sack-perm |
|             ct state invalid drop |
|         } |
|     } |
| |
| The outcome ruleset of the steps above should be similar to the one below. |
| |
|     table ip x { |
|         chain y { |
|             type filter hook prerouting priority raw; policy accept; |
|             tcp flags syn notrack |
|         } |
| |
|         chain z { |
|             type filter hook input priority filter; policy accept; |
|             ct state { invalid, untracked } synproxy mss 1460 wscale 9 timestamp sack-perm |
|             ct state invalid drop |
|         } |
|     } |
| --------------------------------------- |
| |
| FLOW STATEMENT |
| ~~~~~~~~~~~~~~ |
| A flow statement allows you to select which flows to accelerate by bypassing the |
| layer 3 network stack when forwarding. You have to specify the name of the |
| flowtable to which the flow is offloaded. |
| |
| *flow add @*'flowtable' |
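| |
| A minimal sketch of a ruleset using the flow statement is shown below. The |
| flowtable name 'f', the table and chain names and the devices eth0 and eth1 |
| are assumptions made for this example. |
| |
| .Using the flow statement (illustrative sketch) |
| ----------------------------------------------- |
| table inet x { |
|     # flowtable 'f' and its devices are hypothetical |
|     flowtable f { |
|         hook ingress priority 0; devices = { eth0, eth1 }; |
|     } |
|     chain y { |
|         type filter hook forward priority 0; policy accept; |
|         # offload forwarded TCP flows to flowtable f |
|         ip protocol tcp flow add @f |
|     } |
| } |
| ----------------------------------------------- |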
| |
| QUEUE STATEMENT |
| ~~~~~~~~~~~~~~~ |
| This statement passes the packet to userspace using the nfnetlink_queue handler. |
| The packet is put into the queue identified by its 16-bit queue number. |
| Userspace can inspect and modify the packet if desired. Userspace must then drop |
| or re-inject the packet into the kernel. See libnetfilter_queue documentation |
| for details. |
| |
| [verse] |
| ____ |
| *queue* [*num* 'queue_number'] [*bypass*] |
| *queue* [*num* 'queue_number_from' - 'queue_number_to'] ['QUEUE_FLAGS'] |
| |
| 'QUEUE_FLAGS' := 'QUEUE_FLAG' [*,* 'QUEUE_FLAGS'] |
| 'QUEUE_FLAG' := *bypass* | *fanout* |
| ____ |
| |
| |
| .queue statement values |
| [options="header"] |
| |================== |
| |Value | Description | Type |
| |queue_number | |
| Sets queue number, default is 0. | |
| unsigned integer (16 bit) |
| |queue_number_from | |
| Sets initial queue in the range, if fanout is used. | |
| unsigned integer (16 bit) |
| |queue_number_to | |
| Sets closing queue in the range, if fanout is used. | |
| unsigned integer (16 bit) |
| |===================== |
| |
| .queue statement flags |
| [options="header"] |
| |================== |
| |Flag | Description |
| |bypass | |
| Let packets go through if no userspace application is bound to the queue |
| (by default such packets are dropped). Before using |
| this flag, read libnetfilter_queue documentation for performance tuning recommendations. |
| |fanout | |
| Distribute packets between several queues. |
| |=============================== |
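| |
| The forms from the synopsis above could be used as sketched below. The table |
| and chain names, ports and queue numbers are arbitrary example values. |
| |
| .Using the queue statement (illustrative sketch) |
| ------------------------------------------------ |
| # queue to the default queue 0 |
| ip filter input tcp dport 80 queue |
| |
| # queue to queue 2 with the bypass flag set |
| ip filter input tcp dport 443 queue num 2 bypass |
| |
| # distribute packets between queues 0 to 3 |
| ip filter input udp dport 53 queue num 0-3 fanout |
| ------------------------------------------------ |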
| |
| DUP STATEMENT |
| ~~~~~~~~~~~~~ |
| The dup statement is used to duplicate a packet and send the copy to a different |
| destination. |
| |
| [verse] |
| *dup to* 'device' |
| *dup to* 'address' *device* 'device' |
| |
| .Dup statement values |
| [options="header"] |
| |================== |
| |Expression | Description | Type |
| |address | |
| Specifies that the copy of the packet should be sent to a new gateway.| |
| ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. ip saddr map { 192.168.1.2 : 10.1.1.1 } |
| |device | |
| Specifies that the copy should be transmitted via device. | |
| string |
| |=================== |
| |
| |
| .Using the dup statement |
| ------------------------ |
| # send to machine with ip address 10.2.3.4 on eth0 |
| ip filter forward dup to 10.2.3.4 device "eth0" |
| |
| # copy raw frame to another interface |
| netdev ingress dup to "eth0" |
| dup to "eth0" |
| |
| # combine with map dst addr to gateways |
| dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" } |
| ----------------------------------- |
| |
| FWD STATEMENT |
| ~~~~~~~~~~~~~ |
| The fwd statement is used to redirect a raw packet to another interface. It is |
| only available in the netdev family ingress hook. It is similar to the dup |
| statement except that no copy is made. |
| |
| *fwd to* 'device' |
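| |
| A netdev ingress chain using fwd might look like the sketch below. The table |
| and chain names and the devices eth0 and eth1 are assumptions for illustration. |
| |
| .Using the fwd statement (illustrative sketch) |
| ---------------------------------------------- |
| table netdev x { |
|     chain y { |
|         # eth0 is the (hypothetical) device the ingress hook is attached to |
|         type filter hook ingress device eth0 priority 0; policy accept; |
|         # redirect DNS queries to eth1 without making a copy |
|         udp dport 53 fwd to "eth1" |
|     } |
| } |
| ---------------------------------------------- |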
| |
| SET STATEMENT |
| ~~~~~~~~~~~~~ |
| The set statement is used to dynamically add or update elements in a set from |
| the packet path. The set setname must already exist in the given table and must |
| have been created with one or both of the dynamic and the timeout flags. The |
| dynamic flag is required if the set statement expression includes a stateful |
| object. The timeout flag is implied if the set is created with a timeout, and is |
| required if the set statement updates elements, rather than adding them. |
| Furthermore, these sets should specify a maximum set size (to prevent |
| memory exhaustion), and their elements should have a timeout (so their number |
| will not grow indefinitely), either from the set definition or from the statement |
| that adds or updates them. The set statement can be used, for example, to create dynamic |
| blacklists. |
| |
| [verse] |
| {*add* | *update*} *@*'setname' *{* 'expression' [*timeout* 'timeout'] [*comment* 'string'] *}* |
| |
| .Example for simple blacklist |
| ----------------------------- |
| # declare a set, bound to table "filter", in family "ip". Timeout and size are mandatory because we will add elements from packet path. |
| nft add set ip filter blackhole "{ type ipv4_addr; flags timeout; size 65536; }" |
| |
| # whitelist internal interface. |
| nft add rule ip filter input meta iifname "internal" accept |
| |
| # drop packets coming from blacklisted ip addresses. |
| nft add rule ip filter input ip saddr @blackhole counter drop |
| |
| # add source ip addresses to the blacklist if more than 10 tcp connection requests per second arrive from one ip address. |
| # entries will time out after one minute, after which they might be re-added if limit condition persists. |
| nft add rule ip filter input tcp flags syn tcp dport ssh meter flood size 128000 { ip saddr timeout 10s limit rate over 10/second} add @blackhole { ip saddr timeout 1m } drop |
| |
| # inspect state of the rate limit meter: |
| nft list meter ip filter flood |
| |
| # inspect content of blackhole: |
| nft list set ip filter blackhole |
| |
| # manually add two addresses to the set: |
| nft add element filter blackhole { 10.2.3.4, 10.23.1.42 } |
| ----------------------------------------------- |
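| |
| The *update* form can be sketched in ruleset file syntax as follows. The set |
| name 'recent_ssh', the timeout and the size are assumptions for this example; |
| the set carries a size and a default timeout as recommended above. |
| |
| .Refreshing set elements with update (illustrative sketch) |
| ---------------------------------------------------------- |
| table ip filter { |
|     # hypothetical set; elements expire 5 minutes after the last update |
|     set recent_ssh { |
|         type ipv4_addr |
|         flags dynamic,timeout |
|         timeout 5m |
|         size 65536 |
|     } |
|     chain input { |
|         type filter hook input priority 0; policy accept; |
|         # add the source address, or refresh its timeout if already present |
|         tcp dport 22 ct state new update @recent_ssh { ip saddr } |
|     } |
| } |
| ---------------------------------------------------------- |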
| |
| MAP STATEMENT |
| ~~~~~~~~~~~~~ |
| The map statement is used to look up data based on some specific input key. |
| |
| [verse] |
| ____ |
| 'expression' *map* *{* 'MAP_ELEMENTS' *}* |
| |
| 'MAP_ELEMENTS' := 'MAP_ELEMENT' [*,* 'MAP_ELEMENTS'] |
| 'MAP_ELEMENT' := 'key' *:* 'value' |
| ____ |
| |
| The 'key' is a value returned by 'expression'. |
| |
| |
| .Using the map statement |
| ------------------------ |
| # select DNAT target based on TCP dport: |
| # connections to port 80 are redirected to 192.168.1.100, |
| # connections to port 8888 are redirected to 192.168.1.101 |
| nft add rule ip nat prerouting dnat to tcp dport map { 80 : 192.168.1.100, 8888 : 192.168.1.101 } |
| |
| # source address based SNAT: |
| # packets from net 192.168.1.0/24 will appear as originating from 10.0.0.1, |
| # packets from net 192.168.2.0/24 will appear as originating from 10.0.0.2 |
| nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 } |
| ------------------------ |
| |
| VMAP STATEMENT |
| ~~~~~~~~~~~~~~ |
| The verdict map (vmap) statement works analogously to the map statement, but |
| contains verdicts as values. |
| |
| [verse] |
| ____ |
| 'expression' *vmap* *{* 'VMAP_ELEMENTS' *}* |
| |
| 'VMAP_ELEMENTS' := 'VMAP_ELEMENT' [*,* 'VMAP_ELEMENTS'] |
| 'VMAP_ELEMENT' := 'key' *:* 'verdict' |
| ____ |
| |
| .Using the vmap statement |
| ------------------------- |
| # jump to different chains depending on layer 4 protocol type: |
| nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain , icmp : jump icmp-chain } |
| ------------------------ |
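| |
| Verdict maps can also hold absolute verdicts. The sketch below assumes an |
| existing 'ip filter' table with an 'input' chain; the chain names 'wan' and |
| 'lan' and the interface names are hypothetical. |
| |
| .Further vmap uses (illustrative sketch) |
| ---------------------------------------- |
| # decide on the connection tracking state with a single rule: |
| nft add rule ip filter input ct state vmap { established : accept, related : accept, invalid : drop } |
| |
| # dispatch to per-interface chains (assumed to exist): |
| nft add rule ip filter input iifname vmap { "eth0" : jump wan, "eth1" : jump lan } |
| ---------------------------------------- |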