Welcome!

LinkedIn GitHub Itch.io

Projects and articles found here!

Overview

diode - modifying a £10 switch to act as a firmware data diode

switch port_isolation

autospy - a test spy object library for Rust

#[cfg_attr(test, autospy::autospy)]
trait MyTrait {
    fn foo(&self, x: u32) -> bool;
}

fn use_trait(trait_object: &impl MyTrait) -> bool {
    trait_object.foo(10)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_trait() {
        let spy = MyTraitSpy::default(); // build spy

        spy.foo.returns.set([true]); // set the return values

        assert!(use_trait(&spy)); // use the spy
        assert_eq!([10], spy.foo.arguments) // check the captured arguments
    }
}

licenses - cargo subcommand for collecting, summarising and checking licenses

$ cargo licenses --help
Usage: cargo licenses [OPTIONS] <COMMAND>

Commands:
  collect  Collects all licenses into a folder
  summary  Provides a summary of all licenses
  check    Checks all licenses for inconsistencies
  diff     Diff between the current licenses folder and the licenses that would be collected

Options:
  -d, --dev                  Include dev dependencies [default: excluded]
  -b, --build                Include build dependencies [default: excluded]
  -D, --depth <DEPTH>        The depth of dependencies to include [default: all sub dependencies]
  -e, --exclude <WORKSPACE>  Exclude specified workspace [default: all included]
  -i, --ignore <CRATE>       Ignore specified crate [default: all included]
  -c, --config <PATH>        Path to configuration file
  -h, --help                 Print help

w5500-evb-pico-json - protocol break relay for valid JSON on the W5500-EVB-Pico

W5500-ECB-Pico

trust-list - command line tool for generating a markdown dependency information table

namedownloadscontributorsreverse_dependenciesversionscreated_atupdated_atrepository
anyhow455074655242455810205/10/201919/09/2025https://github.com/dtolnay/anyhow
chrono39363107930+174919220/11/201408/09/2025https://github.com/chronotope/chrono
clap56455292130+2592644401/03/201529/10/2025https://github.com/clap-rs/clap
field_names55654813308/01/202104/01/2022https://github.com/TedDriggs/field_names
itertools70139948030+703813021/11/201431/12/2024https://github.com/rust-itertools/itertools
pbr2835208261052414/10/201508/02/2023https://github.com/a8m/pb
reqwest30766343930+1461211516/10/201613/10/2025https://github.com/seanmonstar/reqwest
serde70166718930+5954431505/12/201427/09/2025https://github.com/serde-rs/serde
serde_json61622793030+4196017707/08/201514/09/2025https://github.com/serde-rs/json

redacta - command line tool for redacting information from text

$ echo "Look at my 192.168.0.1 IP!" | redacta --ipv4
Look at my *********** IP!

autospy

🎡 autospy record, autospy replace 🎡

Crates.io Version docs.rs GitHub Actions Workflow Status MIT

A test spy object library.

Overview

A test spy is a type of test double used in unit testing. It provides the same interface as the production code, but allows you to set outputs before use in a test and to verify input parameters after the spy has been used.

#[autospy] generates a test spy object for traits.
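
To make the idea concrete, here is a minimal hand-written spy for a single-method trait. It is only a sketch of the pattern and not the code that #[autospy] generates; the names Greeter, GreeterSpy and welcome are made up for this illustration.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

// Hypothetical production trait, used only for this illustration.
trait Greeter {
    fn greet(&self, name: &str) -> bool;
}

// A hand-written spy: records every argument and replays queued return values.
#[derive(Default)]
struct GreeterSpy {
    arguments: RefCell<Vec<String>>,
    returns: RefCell<VecDeque<bool>>,
}

impl Greeter for GreeterSpy {
    fn greet(&self, name: &str) -> bool {
        // Capture the input so the test can inspect it afterwards.
        self.arguments.borrow_mut().push(name.to_string());
        // Hand back the next value the test queued up.
        self.returns
            .borrow_mut()
            .pop_front()
            .expect("no return value set on spy")
    }
}

// Code under test: it only knows about the trait, not the spy.
fn welcome(greeter: &impl Greeter) -> bool {
    greeter.greet("world")
}
```

A test would queue a return value on the spy, call welcome with it, and then assert on both the result and the recorded arguments. Writing this by hand for every trait is the boilerplate the macro is meant to remove.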

Usage

The example below demonstrates use in a unit test assuming autospy is included in [dev-dependencies].

#[cfg_attr(test, autospy::autospy)]
trait MyTrait {
    fn foo(&self, x: u32) -> bool;
}

fn use_trait(trait_object: &impl MyTrait) -> bool {
    trait_object.foo(10)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_trait() {
        let spy = MyTraitSpy::default(); // build spy

        spy.foo.returns.set([true]); // set the return values

        assert!(use_trait(&spy)); // use the spy
        assert_eq!([10], spy.foo.arguments) // check the captured arguments
    }
}

For additional examples and features see the docs.

Acknowledgements

Autospy is heavily influenced by the excellent mockall crate, which, through automock, provides many similar features.

Autospy aims to offer these features through a macro-generated spy object, rather than a mock object. The use of either is largely personal preference; however, there are some advantages to using a spy object:

| Test object | Test failures | Test structure | Complexity |
| --- | --- | --- | --- |
| Mock | Panics if expectations fail; error messages can be unclear | Less standard pattern, expectations are baked into object | More crate-specific syntax and usage patterns |
| Spy | Asserts like any regular test | Assert after use, more standard test pattern | Simple: set what's returned, then inspect what was called |

licenses

crates.io GitHub Actions Workflow Status MIT

Cargo subcommand for collecting licenses.

Install

$ cargo install licenses

Usage

$ cargo licenses --help
Usage: cargo licenses [OPTIONS] <COMMAND>

Commands:
  collect  Collects all licenses into a folder
  summary  Provides a summary of all licenses
  check    Checks all licenses for inconsistencies
  diff     Diff between the current licenses folder and the licenses that would be collected

Options:
  -d, --dev                  Include dev dependencies [default: excluded]
  -b, --build                Include build dependencies [default: excluded]
  -D, --depth <DEPTH>        The depth of dependencies to include [default: all sub dependencies]
  -e, --exclude <WORKSPACE>  Exclude specified workspace [default: all included]
  -i, --ignore <CRATE>       Ignore specified crate [default: all included]
  -c, --config <PATH>        Path to configuration file
  -h, --help                 Print help

Commands

Collect

Collects all licenses into a folder.

The output folder path can be specified with --path, defaults to licenses.

Prints a warning:

  • If the crate had no declared license on crates.io (none declared)
  • If no licenses were found for a crate (empty)
  • If there were fewer licenses found for a crate than declared by the author on crates.io (too few)
  • If there were more licenses found for a crate than declared by the author on crates.io (additional)
  • If the content of the found licenses did not match the expected content for those licenses (mismatch)

$ cargo licenses collect --depth 1
licenses
├── anyhow-LICENSE-APACHE
├── anyhow-LICENSE-MIT
├── cargo_metadata-LICENSE-MIT
├── clap-LICENSE-APACHE
├── clap-LICENSE-MIT
├── colored-LICENSE
├── indicatif-LICENSE
├── itertools-LICENSE-APACHE
├── itertools-LICENSE-MIT
├── serde-LICENSE-APACHE
├── serde-LICENSE-MIT
├── serde_json-LICENSE-APACHE
├── serde_json-LICENSE-MIT
├── spdx-LICENSE-APACHE
├── spdx-LICENSE-MIT
├── strsim-LICENSE
├── toml-LICENSE-APACHE
└── toml-LICENSE-MIT

Summary

Summarises the declared licenses.

The declared license is what the author declares the license as on crates.io; it is not necessarily the same as the actual licenses. The warnings generated by the collect and check commands will highlight discrepancies between the declared licenses and the actual licenses.

The summary can be formatted as JSON or TOML with --json or --toml respectively.

$ cargo licenses summary --depth 1
MIT - cargo_metadata,indicatif,strsim
MIT OR Apache-2.0 - anyhow,clap,itertools,serde,serde_json,spdx,toml
MPL-2.0 - colored
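
The grouping shown in this output can be sketched in a few lines. This is an illustration of the idea only and not the tool's actual implementation; the summarise function and its input shape are assumptions.

```rust
use std::collections::BTreeMap;

// Groups crate names by their declared license expression and renders one
// line per license, in the same "LICENSE - crate,crate" shape as the output above.
// Hypothetical helper: the input is a list of (crate name, declared license) pairs.
fn summarise(declared: &[(&str, &str)]) -> Vec<String> {
    let mut by_license: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(krate, license) in declared {
        by_license.entry(license).or_default().push(krate);
    }
    by_license
        .into_iter()
        .map(|(license, mut crates)| {
            // Sort so the output is stable regardless of input order.
            crates.sort_unstable();
            format!("{} - {}", license, crates.join(","))
        })
        .collect()
}
```

A BTreeMap keeps the licenses in sorted order, which makes the summary deterministic and easy to compare between runs.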

Check

Checks all licenses for inconsistencies.

Returns a non-zero exit code:

  • If the crate had no declared license on crates.io (none declared)
  • If no licenses were found for a crate (empty)
  • If there were fewer licenses found for a crate than declared by the author on crates.io (too few)
  • If there were more licenses found for a crate than declared by the author on crates.io (additional)
  • If the content of the found licenses did not match the expected content for those licenses (mismatch)

$ cargo licenses check
warning: additional - found all declared licenses, but found additional licenses for:
        memchr - COPYING
        unicode_xid - COPYRIGHT
        utf8_iter - COPYRIGHT
warning: mismatch - found license(s) whose content was not similar to declared licenses for:
        portable_atomic - LICENSE-APACHE
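
The count-based warnings can be modelled roughly as below. This is a sketch under assumptions: the classify and exit_code functions are hypothetical, the order in which the conditions are checked is a guess, the exit code of 1 stands in for "non-zero", and the content-based mismatch warning is left out entirely.

```rust
// The count-based warning kinds named in the list above.
#[derive(Debug, PartialEq)]
enum Warning {
    NoneDeclared,
    Empty,
    TooFew,
    Additional,
}

// Hypothetical classifier: compares how many licenses a crate declares with
// how many license files were found. `declared` is None when the crate has
// no declared license at all.
fn classify(declared: Option<usize>, found: usize) -> Option<Warning> {
    match declared {
        None => Some(Warning::NoneDeclared),
        Some(_) if found == 0 => Some(Warning::Empty),
        Some(n) if found < n => Some(Warning::TooFew),
        Some(n) if found > n => Some(Warning::Additional),
        Some(_) => None,
    }
}

// Any remaining warning results in a non-zero exit code (1 is an assumption).
fn exit_code(warnings: &[Warning]) -> i32 {
    if warnings.is_empty() { 0 } else { 1 }
}
```

For example, a crate declaring "MIT OR Apache-2.0" with only one license file found would be classified as too few.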

Diff

Compares the current collected licenses folder against the licenses that would be collected.

Current licenses folder path can be specified with --path, defaults to licenses.

Returns a non-zero exit code if there is a difference between the licenses that would be collected and the current collected licenses folder.

$ cargo licenses diff

Configuration

A TOML configuration file can be used to store all passed flags, as well as enabling options on a per-crate basis. If both a config and a flag set the same option, the flag will take precedence.

$ cargo licenses <COMMAND> --config licenses.toml
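
The precedence rule can be pictured as a field-by-field merge where a flag wins whenever it is set. The Options struct and merge function below are hypothetical and only illustrate the rule; they are not taken from the tool's source.

```rust
// Hypothetical option set; the field names mirror some of the documented flags.
// None means "not specified" for that source.
#[derive(Debug, Default, Clone, PartialEq)]
struct Options {
    dev: Option<bool>,
    build: Option<bool>,
    depth: Option<u8>,
}

// A flag wins when it is set; otherwise the value from the config file is used.
fn merge(flags: Options, config: Options) -> Options {
    Options {
        dev: flags.dev.or(config.dev),
        build: flags.build.or(config.build),
        depth: flags.depth.or(config.depth),
    }
}
```

With depth = 1 in the config file and --depth 2 on the command line, the merged depth is 2, while options only present in the config file are kept as they are.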

Skipping licenses

The configuration file allows the selective skipping of licenses found by the various subcommands. It is recommended to provide a comment per skipped license to indicate why it is deemed okay to skip, for instance it might be erroneously detected as a license because of the filename.

[crates]
example_crate = { skip = ["FILE"] } # comment on why the files are skipped

Allowing warnings

Warnings generated by the collect or check command can be allowed in the configuration file, which lets erroneous warnings be selectively silenced. It is recommended to provide a comment on why a warning is being allowed. The warnings that can be allowed are named in the warning message; these are:

  • too few
  • empty
  • none declared
  • { additional = ["file1", "file2"] }
  • { mismatch = ["file1", "file2"] }

[crates]
example_crate = { allow = "too few" }

Include licenses

Additional licenses can be included for a specific crate via the configuration file.

[crates]
example_crate = { include = [{ name = "LICENSE", text = "custom license text" }] }

Example

The below is an example of a TOML configuration file that could be used via the --config flag.

[global]
dev = true
build = true
depth = 1
exclude = ["workspace"]
ignore = ["crate"]

[crates]
crate_one = { skip = ["COPYING"] } # not a license, statement of which licenses the crate falls under
crate_two = { allow = { mismatch = ["LICENSE"] } } # erroneous license content mismatch
crate_three = { allow = "too few" } # only one license provided

Usage patterns

This tool is designed to help collect required licenses when shipping software with open-source dependencies. The intended pattern of use would look as follows:

  • summary provides a quick way to see if any dependencies are using stricter licenses that might not be suitable, copy-left for instance
  • collect to collect all licenses into an output folder; this would be done manually and the license folder committed as part of the repository
  • the previous command might have raised warnings about licenses found, or not found, these can be manually assessed then skipped or allowed in the configuration file
  • as part of a continuous integration system, or as a pre-commit hook, a diff should be run to check the licenses folder hasn't missed any licenses added by new dependencies or removed by removing dependencies
  • as part of a continuous integration system a check should be run to confirm all license inconsistencies have been handled in the configuration

This is provided as a convenience to help with collecting and reviewing open-source licenses. It does not guarantee compliance with all legal licensing requirements. It is the user's responsibility to ensure that all applicable licenses are collected, reviewed and adhered to. The authors and contributors of this tool accept no liability for missing, incomplete or inaccurate license files, or for any consequences arising from its use.

w5500-evb-pico-json

GitHub Actions Workflow Status MIT

Protocol break relay for valid JSON on the W5500-EVB-Pico.

W5500-ECB-Pico

Overview

Uses the WIZnet W5500 on the W5500-EVB-Pico in MACRAW mode to pass raw packets to the RP2040. A protocol break is implemented by throwing away the protocol information of received packets, then the payload is validated as JSON, and finally a new packet is constructed to forward the contents on.
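
The first two stages of a protocol break can be sketched as follows. This is a sketch under loud assumptions and not the firmware's code: it assumes the traffic is UDP over IPv4 in Ethernet II frames (the text above does not say which protocols are relayed), the function names are hypothetical, and JSON validation is passed in as a closure because the standard library has no JSON parser. Building the outgoing packet is not shown.

```rust
const ETHERNET_HEADER_LEN: usize = 14;
const UDP_HEADER_LEN: usize = 8;

// Strips the Ethernet, IPv4 and UDP headers from a raw frame and returns the payload.
// Returns None for anything that is not a well-formed IPv4/UDP frame (assumed protocols).
fn extract_payload(frame: &[u8]) -> Option<&[u8]> {
    // The EtherType lives in bytes 12 and 13; 0x0800 is IPv4.
    let ethertype = u16::from_be_bytes([*frame.get(12)?, *frame.get(13)?]);
    if ethertype != 0x0800 {
        return None;
    }
    let ip = frame.get(ETHERNET_HEADER_LEN..)?;
    // The low nibble of the first IPv4 byte is the header length in 32-bit words.
    let ip_header_len = usize::from(*ip.first()? & 0x0f) * 4;
    // Byte 9 of the IPv4 header is the protocol number; 17 is UDP.
    if ip_header_len < 20 || *ip.get(9)? != 17 {
        return None;
    }
    ip.get(ip_header_len + UDP_HEADER_LEN..)
}

// Protocol break: all header information is discarded, and only a payload that
// the validator accepts is handed on for re-packaging.
fn protocol_break(frame: &[u8], is_valid: impl Fn(&[u8]) -> bool) -> Option<Vec<u8>> {
    let payload = extract_payload(frame)?;
    is_valid(payload).then(|| payload.to_vec())
}
```

The important property is that nothing from the received headers survives: the forwarded packet would be built from scratch around the validated bytes.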

Dependencies

You will need a debug probe that supports the Serial Wire Debug (SWD) protocol.

trust-list

crates.io GitHub Actions Workflow Status MIT

Command line tool for generating a dependency information table in markdown.

Install

cargo install trust-list

Usage

$ trust-list --help
Command line tool for generating a dependency information table in markdown

Usage: trust-list [OPTIONS]

Options:
  -o, --output-file <OUTPUT_FILE>  The output filename, appended with .md [default: trust-list]
  -r, --recreate                   Recreate table [default: appends new dependencies]
  -D, --depth <DEPTH>              The depth of dependencies to collect information on [default: all sub dependencies]
  -d, --dev                        Include dev dependencies [default: excluded]
  -b, --build                      Include build dependencies [default: excluded]
  -e, --exclude <EXCLUDE>          Exclude specified workspace [default: all included]
  -h, --help                       Print help
  -V, --version                    Print version

Example

trust-list --depth 1
namedownloadscontributorsreverse_dependenciesversionscreated_atupdated_atrepository
anyhow455074655242455810205/10/201919/09/2025https://github.com/dtolnay/anyhow
chrono39363107930+174919220/11/201408/09/2025https://github.com/chronotope/chrono
clap56455292130+2592644401/03/201529/10/2025https://github.com/clap-rs/clap
field_names55654813308/01/202104/01/2022https://github.com/TedDriggs/field_names
itertools70139948030+703813021/11/201431/12/2024https://github.com/rust-itertools/itertools
pbr2835208261052414/10/201508/02/2023https://github.com/a8m/pb
reqwest30766343930+1461211516/10/201613/10/2025https://github.com/seanmonstar/reqwest
serde70166718930+5954431505/12/201427/09/2025https://github.com/serde-rs/serde
serde_json61622793030+4196017707/08/201514/09/2025https://github.com/serde-rs/json

Compliance

Restricted to one request per second as per crates.io data access policy.
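
A limit like this can be enforced with a very small blocking rate limiter. The sketch below is an assumption about how such a limit could be implemented and is not taken from trust-list itself; the RateLimiter type is hypothetical.

```rust
use std::thread;
use std::time::{Duration, Instant};

// Minimal rate limiter: guarantees at least `interval` between calls to `wait`.
struct RateLimiter {
    interval: Duration,
    last: Option<Instant>,
}

impl RateLimiter {
    fn new(interval: Duration) -> Self {
        Self { interval, last: None }
    }

    // Blocks until `interval` has passed since the previous call, then records the time.
    fn wait(&mut self) {
        if let Some(last) = self.last {
            let elapsed = last.elapsed();
            if elapsed < self.interval {
                thread::sleep(self.interval - elapsed);
            }
        }
        self.last = Some(Instant::now());
    }
}
```

Calling wait before each request with an interval of one second keeps the request rate at or below one per second, and the first call returns immediately.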

redacta

crates.io GitHub Release GitHub Actions Workflow Status MIT

Command line tool for redacting information from text.

Warning: this is an early-stage implementation and it might not redact accurately.

Install

curl -sS https://raw.githubusercontent.com/lhalf/redacta/main/install.sh | sh

Or install via cargo:

cargo install redacta

Usage

Takes text via stdin and forwards redacted text to stdout.

$ redacta --help
Usage: <STDIN> | redacta [OPTIONS]

Options:
      --ipv4           Enable IPv4 redaction
      --ipv6           Enable IPv6 redaction
      --fqdn           Enable FQDN redaction
  -r, --regex <REGEX>  Regex redaction
  -h, --help           Print help

Example

$ echo "Look at my 192.168.0.1 IP!" | redacta --ipv4
Look at my *********** IP!
$ echo "No really, look at my 2001:db8:3333:4444:5555:6666:7777:8888 IP!" | redacta --ipv6
No really, look at my ************************************** IP!
$ echo "Okay, it is example.server.com here..." | redacta --fqdn
Okay, it is ****************** here...
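
The same-length masking seen in these examples can be sketched with the standard library alone. This is a deliberately naive illustration and not how redacta is implemented: it only recognises addresses that stand alone between spaces, and the redact_ipv4 function is hypothetical.

```rust
use std::net::Ipv4Addr;

// Replaces every space-separated token that parses as an IPv4 address
// with asterisks of the same length, leaving all other tokens untouched.
fn redact_ipv4(text: &str) -> String {
    text.split(' ')
        .map(|token| {
            if token.parse::<Ipv4Addr>().is_ok() {
                "*".repeat(token.len())
            } else {
                token.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

An address followed directly by punctuation would be missed by this sketch, which is one reason a real tool needs a more careful matcher.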

Making a firmware data diode from a £10 network switch

What is a data diode?

A cybersecurity device designed to only allow data flow in one direction.

one_direction

Sadly, Simon Cowell has no idea what a data diode is, as they are designed for high-security environments such as:

  • Defence
  • Critical National Infrastructure
  • Financial Services
  • Energy & Utilities

And are a major component in X domain solutions, not The X Factor...

simon_cowell

Get on with it...

Why are they useful?

Data diodes are useful for bridging air-gapped networks where you might not want a network to send traffic back to the other. The green is traffic you want to get out of the network, the red is traffic you don't want to let back in!

data_diode

They are a key component in cross-domain solutions. The goal of this article is not to explain in detail the use cases of data diodes, but rather to explore the title.

Types of data diode

The behaviour of a data diode can be achieved in a variety of ways.

Hardware defined

The purest form of data diode. A piece of physical hardware that enforces uni-directional data flow, it is physically impossible for data to travel in the reverse direction.

Software defined

Software that provides uni-directional network behaviour. There is no physical hardware enforcing this behaviour, so in theory if the software is vulnerable so is the uni-directional behaviour.

Firmware defined

Uni-directional network behaviour enforced through firmware running directly on hardware; again, there is no physical hardware enforcing uni-directional behaviour. In theory, this is more secure than a software defined data diode as you often require physical access to the hardware to modify the firmware. There is also typically a reduced attack surface against firmware as it has fewer responsibilities and therefore fewer exploitable interfaces than software running on general purpose hardware.

If they are used in high-security environments, why aren't they used elsewhere?

Great question, and the answer should be there is no reason.

In practice there are a few barriers to entry that prevent widespread adoption of this technology:

  • They are more expensive than software-based solutions, such as firewalls
  • They are more complicated to integrate into certain networks

How much do they typically cost?

A quick Google search suggests the average cost of a data diode ranges from "a few thousand to tens of thousands of dollars, depending on several factors".

This cost is a high barrier to entry for most users and often doesn't justify the additional security offered over a traditional firewall.

Why are they expensive?

Hardware defined data diodes are expensive because they require custom hardware. They are the only option providing hardware guarantees on uni-directional behaviour, and are therefore favoured for high-security environments where the highest security is financially justifiable.

Can they be made cheaper?

Yes - well maybe.

Hardware defined diodes will always be the preferred choice in high-security environments, because they provide the strongest guarantees. They will likely continue to command a high price due to custom hardware, supporting the teams of experts required to design and manufacture them as well as R&D costs.

Software defined diodes are free in theory, with a solid understanding of computer networking and some coding skills (or prompting skills) you could whip up something that acts like a data diode. There are a few benefits to this approach, notably the flexibility in features and deployment. The downside being your uni-directional behaviour is only as secure as your software, which is often not very!

Firmware defined diodes, in my opinion, offer a middle ground between hardware guarantees, cost and flexibility. They do not offer hardware uni-directional guarantees; however, they do offer a significantly reduced attack surface versus software defined diodes. As this article will explore, they can also be made from cheap commodity hardware.

So, how can we make a firmware defined data diode for £10?

The approach

There are many affordable unmanaged network switches on the market, retailing for as low as £5. These devices often contain mature and flexible switch controller chips, which have a number of configuration options for different applications. As these switches are unmanaged, these configuration options cannot be accessed remotely via a management interface - they are "plug and play". The configuration options are read from an EEPROM chip when the device is powered on, and never again after. If we want to modify the configuration options on these unmanaged switches, we need to modify the contents of the EEPROM chip read when power is applied.

What settings could we modify to achieve uni-directional transfer?

There are a number of configuration options that we could use to provide uni-directional behaviour:

| Configuration option | Does what? | So what? |
| --- | --- | --- |
| VLAN | Enables 802.1Q port-based VLAN tagging | We could prevent traffic tagged with a VID from one port reaching another in a different VLAN |
| Port mirroring | Mirrors the received packets from one port to another | We could copy each packet arriving at the ingress port to the egress port |
| Port isolation | Provides a port isolation mask for a given port | We could allow traffic from the ingress port to reach the egress port, and drop all traffic arriving at the egress port |

Using the VLAN alone, we cannot achieve the behaviour required - the ingress port is on a different VLAN from the egress port, so ingress traffic cannot reach the egress port. We can achieve uni-directionality by combining the VLAN and port mirroring approach, traffic arrives at the ingress port and is tagged with the ingress PVID. The traffic cannot reach any other port because they are all on different VLANs. With port mirroring enabled, these packets are mirrored to the egress port and on to their destination. Any reverse traffic again is tagged by the egress PVID, but still cannot reach any other port so is dropped. There is no mirroring from egress to ingress port.

Port isolation can be used in isolation of VLAN and port mirroring configuration to achieve uni-directionality. We simply configure the port isolation mask for the ingress port to include the egress port, allowing traffic to reach the egress port. We then configure the egress port isolation mask to nothing, meaning any traffic arriving at the egress port cannot reach any other port.
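
The port isolation approach can be modelled as one forwarding bitmask per port. The Switch type below is a hypothetical model of a 5-port switch for illustration only; it does not reflect the real register layout, and giving every unused port an empty mask is an assumption.

```rust
// Hypothetical model of a 5-port switch: one forwarding mask per port.
// Bit `n` set in `masks[p]` means a frame received on port `p` may leave on port `n`.
struct Switch {
    masks: [u8; 5],
}

impl Switch {
    // Data diode configuration: `ingress` may forward to `egress` only;
    // every other port, including `egress`, may forward nowhere.
    fn diode(ingress: usize, egress: usize) -> Self {
        let mut masks = [0u8; 5];
        masks[ingress] = 1 << egress;
        Self { masks }
    }

    // Checks whether a frame received on `from` is allowed to leave on `to`.
    fn can_forward(&self, from: usize, to: usize) -> bool {
        self.masks[from] & (1 << to) != 0
    }
}
```

With zero-indexed ports, Switch::diode(0, 4) lets traffic flow from the first port to the fifth, while anything arriving on the fifth port has no permitted destination.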

We can combine all of these configuration options to minimise the risk one option has a vulnerability. All together:

| Configuration | Reason |
| --- | --- |
| Enable port-based VLAN tagging | Traffic is tagged with the PVID it arrives at, preventing traffic reaching another port |
| Enable dropping VLAN tagged packets | We don't want people to send us traffic they have tagged with a VID |
| Enable mirroring RX traffic | We want to copy received traffic to the egress port |
| Disable mirroring TX traffic | We don't want to copy transmitted traffic to the egress port |
| Disable mirroring RX pause frames | We don't want to copy these frames to the egress port |
| Enable port isolation for all other ports on the ingress port | We want all traffic to be dropped at the ingress port, as the RX traffic will be mirrored to the egress |
| Enable port isolation for all other ports on the egress port | We want all traffic to be dropped at the egress port |

The hardware

I used a TP-Link LS1005G, which I bought from Amazon for £10.

switch

Opening up this switch we can see the various chips.

switch_internals

  • 1 is the EEPROM chip that contains the configuration options for the switch
  • 2 is the switch chip itself, this manages packet switching and will be enforcing our uni-directional behaviour

Flashing the EEPROM

As we explored in the approach we need to modify the configuration settings stored in the EEPROM chip in order to modify the switch behaviour. We can see how the data is loaded from EEPROM when the switch chip is powered up by perusing the datasheet.

The EEPROM interface of the RTL8366 and RTL8369 uses the serial bus EEPROM Serial Management Interface (SMI). The 2K-bit 24C02 EEPROM is read via the EEPROM SMI protocol. When the RTL8366/RTL8369 is powered up, the RTL8366/RTL8369 drives SCK and SDA to read the registers from the EEPROM.

The EEPROM chip is the same type commonly found on PC motherboards for holding the BIOS, which is convenient for us as there are lots of compatible clips for flashing these chips. I used a CH341A programmer, which came bundled with the correct clip for the EEPROM chip on the switch.

eeprom_clip

  • 1 is the CH341A USB EEPROM programmer
  • 2 is the clip attachment for the EEPROM chip

eeprom_clipped

With the clip attached we can now read and program the EEPROM chip using tools such as AsProgrammer. These tools allow us to view the contents of the EEPROM as a hex dump. This isn't particularly useful unless we know how it is interpreted by the switch chip.

asprogrammer

What is in the EEPROM?

Looking in the RTL8366 datasheet we can see how the contents of the EEPROM are interpreted by the switch chip.

  • 1 the first two bytes are read, these determine when the switch chip will stop reading from EEPROM
  • 2 everything else is 2 byte pairs of register location and register value

To set new behaviour on the switch chip we therefore need to modify the first two bytes to read further in EEPROM, then we can put our additional register settings in these newly read bytes.
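
The edit described above can be sketched as a function that appends register entries to an EEPROM image and rewrites the header. The exact encoding is an assumption made for illustration: the sketch treats the first two bytes as a little-endian count of bytes to read, and each entry as a 2-byte register location followed by a 2-byte value. The datasheet is the authority on the real layout, and the append_registers function is hypothetical.

```rust
// A 2K-bit 24C02 EEPROM holds 256 bytes.
const EEPROM_SIZE: usize = 256;

// Appends (register, value) entries to an EEPROM image and updates the two-byte header.
// ASSUMPTIONS (check against the datasheet): the header is a little-endian count of
// bytes to read, and each entry is a 2-byte register followed by a 2-byte value.
fn append_registers(image: &mut Vec<u8>, entries: &[(u16, u16)]) -> Result<(), String> {
    if image.len() < 2 {
        return Err("image is missing its two-byte header".to_string());
    }
    // Check the size first so a failed call leaves the image untouched.
    let new_len = image.len() + entries.len() * 4;
    if new_len > EEPROM_SIZE {
        return Err(format!(
            "image would be {} bytes, EEPROM holds {}",
            new_len, EEPROM_SIZE
        ));
    }
    for &(register, value) in entries {
        image.extend_from_slice(&register.to_le_bytes());
        image.extend_from_slice(&value.to_le_bytes());
    }
    // Extend the read length so the switch chip reads the new entries too.
    let read_length = new_len as u16;
    image[..2].copy_from_slice(&read_length.to_le_bytes());
    Ok(())
}
```

The size check matters in practice: a 256-byte part leaves limited room for extra register settings once the original configuration is accounted for.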

Great! Now we just need to know which registers to set and what the values need to be.

What registers do we need to set?

To find the registers we need to set we can peruse the Realtek unmanaged switch library. This contains some very helpful resources when finding registers including:

  • API documentation - details about the library's functions
  • Programming guide - explains the design of the library and offers guidance on how to use it

And if you're feeling lucky, you might just search for the relevant register needed in the registers header file!

Once we've found the required register for the feature we're interested in, we need to know what to put there.

What do we need to put in the registers?

In the API documentation we find:

The forwarding port mask is configured per port. Every packet switched by switch can’t be forwarded to the ports which are not in the forwarding port mask.
CPU force TX (TX portmask in CPU tag) function doesn’t be affected by Port Isolation. Packets which coming from CPU port and have “TX portmask” in CPU tag can be forwarding to any other ports.
Besides this, Port Isolation is the highest priority in forwarding decision.

So we need to set a port isolation mask for the two ports that we want to act as our data diode; let's use port 1 and port 5.

port_isolation

We can repeat the above steps to configure other settings on the switch chip, such as port-based VLAN tagging and port mirroring.

Once we know what registers we need to configure, the value we need in each register and therefore how many extra bytes into EEPROM we need to read we have everything to flash the chip with our new behaviour!

How can we check it worked?

By using it as intended, of course! Let's send some UDP traffic through it. Importantly, we shouldn't be able to establish a TCP connection, because TCP relies on bidirectional traffic.

I will be testing the behaviour on Linux, which will require the following tools:

| Tool | For what? |
| --- | --- |
| netcat | reading and writing UDP |
| wireshark | analysing the packets being sent and received |
| ip | manually configuring ARP entries |

The first two tools appear obvious, but you might be asking what is an ARP entry and how does it need to be manually configured...

What is an ARP table and why do we need to manually configure entries?

The Address Resolution Protocol (ARP) pretty much does what it says on the tin: it resolves IP addresses to MAC addresses within a network. An ARP table is a cache that stores recent IP addresses mapped to MAC addresses. Each of these mappings is known as an ARP entry, for example an ARP entry might look like 192.168.1.10 → 00:1A:2B:3C:4D:5E. Normally, ARP tables are populated automatically; however, this relies on a bidirectional exchange (sender broadcasts an ARP request, receiver replies with a MAC) - you might be seeing the problem already! With a data diode in the way there is no way for the receiver to reply with its MAC address, meaning no automatically populated ARP entry!

arp

We need to help out and manually give the ARP table entry relating to the device on the other end of the diode we want to send to. We can use the ip command on Linux to do exactly this!

sudo ip neighbour add <RECEIVING_IP> lladdr <RECEIVING_MAC> dev <NETWORK_ADAPTOR> nud permanent
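
The effect of a static entry can be pictured with a toy ARP cache. This is a simplified model for illustration, not how the Linux neighbour table is implemented; the ArpTable type and the addresses used with it are hypothetical.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

type MacAddr = [u8; 6];

// Toy ARP cache: maps IP addresses to MAC addresses.
#[derive(Default)]
struct ArpTable {
    entries: HashMap<Ipv4Addr, MacAddr>,
}

impl ArpTable {
    // Equivalent in spirit to adding a permanent neighbour entry by hand.
    fn add_static(&mut self, ip: Ipv4Addr, mac: MacAddr) {
        self.entries.insert(ip, mac);
    }

    // A frame can only be addressed if the destination MAC is already known;
    // behind a diode no ARP reply will ever arrive to fill the gap.
    fn resolve(&self, ip: Ipv4Addr) -> Option<MacAddr> {
        self.entries.get(&ip).copied()
    }
}
```

Before add_static is called, resolve returns None for the receiver and the sender has nowhere to address its frames; afterwards the lookup succeeds without any ARP exchange taking place.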

Finally, we need to set static IPs on both network adaptors. The tool you use to do this is distro dependent.

The setup with static IPs looks as follows:

test_setup

Once we have manually configured the ARP entry for the receiving device on the other end of the diode we can start sending and receiving UDP packets.

To start we need to start netcat listening on the receiving device, I'm using port 5005 for reasons.

nc --listen --udp 5005

Then we can start wireshark to capture packets arriving at the network adapter on the receiving device.

Now on the sending device we can start wireshark capturing packets being sent from the sending device.

Finally, we can send packets from the sending device using netcat!

echo "forward traffic" | nc --udp 192.168.50.2 5005
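
For comparison, the same fire-and-forget send can be written with the Rust standard library. This is a minimal sketch and was not part of the test setup described here; the send_datagram function is hypothetical.

```rust
use std::io;
use std::net::{ToSocketAddrs, UdpSocket};

// Sends a single datagram and returns without waiting for any reply,
// which is what makes UDP usable across a one-way link.
fn send_datagram(destination: impl ToSocketAddrs, payload: &[u8]) -> io::Result<usize> {
    // Port 0 lets the operating system pick a free source port.
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.send_to(payload, destination)
}
```

Calling send_datagram("192.168.50.2:5005", b"forward traffic\n") would mirror the netcat command above. The call succeeds as soon as the datagram is handed to the network stack, whether or not anything is listening on the other side.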

On the sending machine we see packets go out...

wireshark_send

On the receiving machine we see packets arrive!

wireshark_recv

Let's try the same test with TCP packets and see what happens...

wireshark_send_tcp

We see a SYN packet go out, this is the initial handshake packet of the TCP protocol, but sadly his friends behind the data diode can't ACK back... in a desperate attempt to reach his unresponsive friends he tries to re-transmit the SYN packet many times to no avail, it's all a bit sad.

Maybe they can send something back over UDP, let's send some packets from the receiver...

echo "reverse traffic" | nc --udp 192.168.50.1 5005

On the receiver machine we can see packets go out...

wireshark_reverse_send

And on the sender machine we see...

wireshark_reverse_recv

Nothing!

Conclusion

What I hope this article has demonstrated is that a modified unmanaged network switch can operate as a firmware defined data diode with a little bit of tweaking.

Future work

There are a number of possibilities this initial work opens up, such as:

  • Experimenting with other features on the switch chips
  • Experimenting with more complex architectures to build towards cross domain solutions
  • Experimenting with the 8 port variants

Acknowledgements

Thanks to Rene and the Open source data diode repository, where the original idea came from. I will be contributing the guide and registers for setting up a device for yourself!

Test doubles in Rust: mockall vs autospy

autospy_vs_mockall

Unit testing is one of many tools in a software engineer's arsenal for validating that their code does what they think it does. Unit testing aims to validate that an individual module, function or unit does what we expect in isolation.

To achieve this isolation, it is common when writing unit tests to use test doubles to mimic interfaces without relying on a real implementation.

This is such a common scenario in fact that there are many crates that endeavour to simplify this process and reduce boilerplate, cumulatively racking up millions of downloads.

  • mockall
  • wiremock
  • faux
  • mockers
  • unimock
  • pseudo

Arrange, Act, Assert

One thing I noticed when moving from C++ to Rust, and which is evident from the above list, is that mocks tend to be the test double of choice in Rust. This was an interesting revelation coming from other languages, where the most common test doubles are typically fakes, stubs and spies.

You might be thinking: they're all test doubles, does it really make that much of a difference? The answer is yes. There are some obvious and not so obvious differences between mocks and other types of test doubles that I think should be taken into consideration.

Firstly, the "Arrange, Act, Assert" test structure that I had become familiar with, and that is touted as "best practice", didn't seem to fall naturally out of tests that use mocks. It didn't feel like there was a clear divide between which part of the test was the arrange section and which was the assert.

aaa

Test structure is just one thing that differs between mocks and other test doubles. There are also some functional differences that will be covered later. To illustrate the "Arrange, Act, Assert" differences, let's compare a typical test structure using mocks versus spies with some example code...

Mocks: typical test structure

  • Configure the mock - this can include setting return values, specifying the expected arguments, defining call order or other expectations
  • Inject and use the mock - then assert the function under test produces the expected result
  • Panics during execution - if any of the expectations are violated, the mock will panic inside the function under test

#[cfg_attr(test, mockall::automock)]
trait SaveFile {
    fn save_file(&self, filename: &str, contents: &[u8]) -> anyhow::Result<()>;
}

fn save_file_to_disk(
    file_system: &impl SaveFile,
    filename: &str,
    contents: &[u8],
) -> anyhow::Result<()> {
    file_system.save_file(filename, contents)
}

#[cfg(test)]
mod tests {
    use super::*;
    use mockall::predicate::*;

    #[test]
    fn fails_to_save_file() {
        // Arrange & Assert ------------------------------------------------
        let mut mock = MockSaveFile::new();
        mock.expect_save_file()
            .with(eq("filename"), eq(b"contents".as_ref()))
            .times(1)
            .returning(|_, _| Err(anyhow::anyhow!("deliberate test error")));

        // Act -------------------------------------------------------------
        assert!(save_file_to_disk(&mock, "filename", b"contents").is_err());
    }
}

Spies: typical test structure

  • Configure the spy - this usually involves setting the return values
  • Inject and use the spy - then assert the function under test produces the expected result
  • Verify arguments - assert the spy was called with the expected arguments

#[cfg_attr(test, autospy::autospy)]
trait SaveFile {
    fn save_file(&self, filename: &str, contents: &[u8]) -> anyhow::Result<()>;
}

fn save_file_to_disk(
    file_system: &impl SaveFile,
    filename: &str,
    contents: &[u8],
) -> anyhow::Result<()> {
    file_system.save_file(filename, contents)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn fails_to_save_file() {
        // Arrange --------------------------------------------------------
        let spy = SaveFileSpy::default();
        spy.save_file
            .returns
            .set([Err(anyhow::anyhow!("deliberate test error"))]);

        // Act ------------------------------------------------------------
        assert!(save_file_to_disk(&spy, "filename", b"contents").is_err());

        // Assert ---------------------------------------------------------
        assert_eq!(
            [("filename".to_string(), b"contents".to_vec())],
            spy.save_file.arguments
        );
    }
}
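To make the spy's role more concrete, here is a rough hand-written equivalent in plain Rust. This is a sketch of the general technique, not autospy's actual implementation: `HandRolledSpy` and its fields are hypothetical names, and `Result<(), String>` stands in for `anyhow::Result<()>` so the example has no dependencies.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

trait SaveFile {
    fn save_file(&self, filename: &str, contents: &[u8]) -> Result<(), String>;
}

fn save_file_to_disk(
    file_system: &impl SaveFile,
    filename: &str,
    contents: &[u8],
) -> Result<(), String> {
    file_system.save_file(filename, contents)
}

/// Hypothetical hand-written spy: replays queued return values and
/// records every call. Interior mutability lets it work behind `&self`.
#[derive(Default)]
struct HandRolledSpy {
    returns: RefCell<VecDeque<Result<(), String>>>,
    arguments: RefCell<Vec<(String, Vec<u8>)>>,
}

impl SaveFile for HandRolledSpy {
    fn save_file(&self, filename: &str, contents: &[u8]) -> Result<(), String> {
        // Capture owned copies of the borrowed arguments for later assertions.
        self.arguments
            .borrow_mut()
            .push((filename.to_string(), contents.to_vec()));
        // Hand back the next queued return value; an unconfigured spy is a test bug.
        self.returns
            .borrow_mut()
            .pop_front()
            .expect("no return value queued for save_file")
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn fails_to_save_file() {
        // Arrange
        let spy = HandRolledSpy::default();
        spy.returns
            .borrow_mut()
            .push_back(Err("deliberate test error".to_string()));

        // Act
        assert!(save_file_to_disk(&spy, "filename", b"contents").is_err());

        // Assert
        assert_eq!(
            vec![("filename".to_string(), b"contents".to_vec())],
            *spy.arguments.borrow()
        );
    }
}
```

The spy itself never decides whether a call was correct. It only records, and the test makes that judgement afterwards in its own assert section.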

Advantages of spies


Arrange, Act, Assert

As previously mentioned, "Arrange, Act, Assert" is the expected structure for unit tests, and tests that use spies follow it naturally. This improves readability when people drop in and out of a codebase.

Crate specific syntax

Something else of note is the reduction in crate-specific syntax. In the mock example, expressing our expectations required expect_save_file(), with(), times() and returning(). These might read as obvious to a seasoned Rust veteran, or even a regular mock user; however, there is a cognitive load in understanding what each of them does, and the additional complexity in the interface means a fresh pair of eyes would have to peruse the documentation.

In the spy example we can see this reduction: the only crate-specific function is set(). You might justifiably argue that returns and arguments are conceptually part of the library and therefore crate-specific, but through the lens of "Arrange, Act, Assert" each falls very clearly into one category or the other, so the test structure remains consistent with unit test structures common in other languages.

Does not panic in function under test

One final difference, which depending on the situation can be an advantage, is that spies don't panic inside the function under test when expectations are not met. Why is this relevant? The first reason is that when a mock panics mid-call, you don't get the nice error messages you get with Rust's asserts. For example, take the previous mock example, except that while writing it I make a mistake and misspell "filename" as "filenam". What happens?

#[test]
fn fails_to_save_file() {
    let mut mock = MockSaveFile::new();
    mock.expect_save_file()
        .with(eq("filenam"), eq(b"contents".as_ref()))
        .times(1)
        .returning(|_, _| Err(anyhow::anyhow!("deliberate test error")));

    assert!(save_file_to_disk(&mock, "filename", b"contents").is_err());
}

failures:

---- mock::tests::fails_to_save_file stdout ----

thread 'mock::tests::fails_to_save_file' (44145) panicked at src/mock.rs:1:18:
MockSaveFile::save_file("filename", [99, 111, 110, 116, 101, 110, 116, 115]): No matching expectation found
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Okay, we have "No matching expectation found", but which expectation? We have nothing to compare against, and no indication of which argument is causing the issue. That might not seem like a big problem here, where we have two relatively simple arguments, but with more arguments, or more complex types, it quickly becomes hard to diagnose.

Let's compare to a spy:

#[test]
fn fails_to_save_file() {
    let spy = SaveFileSpy::default();
    spy.save_file
       .returns
       .set([Err(anyhow::anyhow!("deliberate test error"))]);

    assert!(save_file_to_disk(&spy, "filename", b"contents").is_err());

    assert_eq!(
        [("filenam".to_string(), b"contents".to_vec())],
        spy.save_file.arguments
    );
}

failures:

---- spy::tests::fails_to_save_file stdout ----

thread 'spy::tests::fails_to_save_file' (49891) panicked at src/spy.rs:45:9:
assertion `left == right` failed
  left: [("filenam", [99, 111, 110, 116, 101, 110, 116, 115])]
 right: [("filename", [99, 111, 110, 116, 101, 110, 116, 115])]
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Looks like any other Rust test assert message to me!
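The difference in where the failure surfaces can be sketched with two hand-written doubles in plain Rust. Everything here is illustrative rather than taken from either crate: `StrictMock` and `RecordingSpy` are hypothetical names, the trait is trimmed to a single argument, and `std::panic::catch_unwind` is used only to observe the panic.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

trait SaveFile {
    fn save_file(&self, filename: &str) -> Result<(), String>;
}

fn save_file_to_disk(file_system: &impl SaveFile, filename: &str) -> Result<(), String> {
    file_system.save_file(filename)
}

/// Mock-style double: checks the argument at call time and panics on mismatch.
struct StrictMock {
    expected_filename: String,
}

impl SaveFile for StrictMock {
    fn save_file(&self, filename: &str) -> Result<(), String> {
        if filename != self.expected_filename {
            panic!("save_file({:?}): no matching expectation found", filename);
        }
        Ok(())
    }
}

/// Spy-style double: records the argument and never judges it.
#[derive(Default)]
struct RecordingSpy {
    filenames: RefCell<Vec<String>>,
}

impl SaveFile for RecordingSpy {
    fn save_file(&self, filename: &str) -> Result<(), String> {
        self.filenames.borrow_mut().push(filename.to_string());
        Ok(())
    }
}

/// Returns true if the function under test panicked when given the mock.
fn mock_panics_inside_function_under_test() -> bool {
    let mock = StrictMock {
        expected_filename: "filenam".to_string(), // deliberate typo
    };
    panic::catch_unwind(AssertUnwindSafe(|| save_file_to_disk(&mock, "filename"))).is_err()
}

/// Returns what the spy recorded once the function under test finished normally.
fn spy_records_without_panicking() -> Vec<String> {
    let spy = RecordingSpy::default();
    save_file_to_disk(&spy, "filename").expect("spy always returns Ok");
    spy.filenames.into_inner()
}
```

With the mock-style double the function under test never returns, so the only diagnostic is whatever the double chose to print. With the spy-style double the test gets the recorded values back and can compare them using an ordinary `assert_eq!`.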

Advantages of mocks

It wouldn't be a fair comparison without looking at some of the advantages of mocks, as there are some benefits that might swing your choice.

Line count

This isn't always the case, but mocks typically require fewer lines to achieve the same functionality.

Flexibility

As previously mentioned, mocks typically come with a lot more crate-specific syntax, such as the with(), times() and returning() calls used in the earlier example.

That means lots of bells and whistles, which gives you as the author more flexibility and more levers at your disposal when implementing tests.
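As a rough illustration of two such levers, the sketch below hand-rolls a double whose return value is computed from the arguments by a closure (the role returning() plays in the mock example) and which counts its calls (the information times() checks). It is plain Rust under stated assumptions, not mockall's implementation: `ClosureDouble` is a hypothetical name and `Result<(), String>` replaces `anyhow::Result<()>`.

```rust
use std::cell::Cell;

trait SaveFile {
    fn save_file(&self, filename: &str, contents: &[u8]) -> Result<(), String>;
}

/// Hypothetical double driven by a closure, with a call counter.
struct ClosureDouble<F>
where
    F: Fn(&str, &[u8]) -> Result<(), String>,
{
    behaviour: F,
    calls: Cell<usize>,
}

impl<F> ClosureDouble<F>
where
    F: Fn(&str, &[u8]) -> Result<(), String>,
{
    fn new(behaviour: F) -> Self {
        ClosureDouble {
            behaviour,
            calls: Cell::new(0),
        }
    }
}

impl<F> SaveFile for ClosureDouble<F>
where
    F: Fn(&str, &[u8]) -> Result<(), String>,
{
    fn save_file(&self, filename: &str, contents: &[u8]) -> Result<(), String> {
        // Count the call, then let the closure decide the result from the arguments.
        self.calls.set(self.calls.get() + 1);
        (self.behaviour)(filename, contents)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn rejects_empty_files() {
        let double = ClosureDouble::new(|_filename: &str, contents: &[u8]| {
            if contents.is_empty() {
                Err("empty file".to_string())
            } else {
                Ok(())
            }
        });

        assert!(double.save_file("a.txt", b"data").is_ok());
        assert!(double.save_file("b.txt", b"").is_err());
        assert_eq!(2, double.calls.get());
    }
}
```

Each extra lever adds to the interface a reader has to learn, which is the trade-off against the crate-specific syntax discussed earlier.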

Conclusion

You might have read all of this (in which case thanks!) and thought, "well, you made autospy, of course you're going to suggest people use it"... to which I say: I hope this article is convincing enough for you to give spies a try 😄!

These observations might just come from the perspective of someone who has always used spies and fakes, and you might be perfectly happy to continue using mocks! Power to you, but if you ever find yourself looking for an alternative that you might enjoy more... autospy is always here!