On-Premises Collectors

There are times when moving data from an on-premises location cannot be done within good security or operational practices. In these cases, a collector might be the answer.

A collector is a server image that can be installed on an on-premises VMware host. The purpose of a collector is to address issues that can arise when collecting data from on-premises systems:

  • Legacy Protocols. Older systems sometimes do not provide a secure means of transmitting data from on-premises to the cloud. For example, the traditional Syslog protocol (port UDP/514) transmits data in the clear, meaning unencrypted. For this reason, the data is first sent to a collector, which then uses an encrypted protocol to send it on securely.
  • Packet Loss. Protocols built on the User Datagram Protocol (UDP) are susceptible to packet loss (see below). A collector receives the data locally and forwards it over a reliable protocol that is not impacted by packet loss.
  • Source Authentication. To avoid data poisoning, the collector authenticates itself to the Fluency collector.
  • Increased Data Availability. While AWS provides four (4) nines of availability, an extra nine can be achieved by storing data on the collector in case of network loss.
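
To put the "nines" in the last bullet into concrete terms, the short Python sketch below converts a number of nines into the downtime it allows per year. It is plain arithmetic on the definition of availability and assumes a 365-day year. The function name is illustrative, and the figures say nothing about what any provider actually delivers.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Return the yearly downtime allowed by an availability of `nines` nines."""
    unavailability = 10.0 ** -nines  # e.g. 4 nines -> 0.0001, i.e. 99.99% available
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for nines in (4, 5):
        minutes = downtime_minutes_per_year(nines)
        print(f"{nines} nines: about {minutes:.2f} minutes of downtime per year")
```

Four nines allow roughly 52.6 minutes of downtime per year, and five nines roughly 5.3 minutes, so each extra nine cuts the allowed downtime by a factor of ten.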

Installing and Operating a Collector

The next sections will cover:

  • Adding
  • Updating

UDP and Packet Loss Explained

Packet Loss:

When we send data over the internet, it's broken down into small units called packets. These packets travel through various networks to reach their destination. Sometimes, due to network congestion, hardware failures, or other reasons, some packets may not make it to their destination. This is what we call packet loss.

UDP Protocol:

There are different ways to send these packets, and one of them is through a protocol called User Datagram Protocol (UDP). Unlike some other protocols, such as Transmission Control Protocol (TCP), UDP doesn't ensure that all packets reach their destination. It's like sending postcards without any confirmation of delivery. UDP is often used for real-time communication where low latency matters more than delivering every packet, like in video streaming or online gaming.
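
The "postcard" behavior can be seen with a few lines of Python. This is a minimal sketch that assumes a loopback interface is available. It sends one datagram to a local port with no listener, and the helper name send_to_closed_port is made up for this example. The send call still reports success, because UDP only hands the datagram to the operating system and never waits for a confirmation.

```python
import socket


def send_to_closed_port(payload: bytes) -> int:
    """Send one UDP datagram to a local port that nobody is listening on."""
    # Reserve a free port, then release it so that nothing is listening there.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    closed_port = probe.getsockname()[1]
    probe.close()

    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # sendto() returns the number of bytes handed to the OS.
        # It does not tell the sender whether anyone received them.
        return sender.sendto(payload, ("127.0.0.1", closed_port))


if __name__ == "__main__":
    sent = send_to_closed_port(b"hello")
    print(f"sendto reported {sent} bytes sent, although no receiver exists")
```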

How Packet Loss Affects UDP:

Because UDP doesn't guarantee delivery, packet loss can be more noticeable. Imagine you're watching a live video stream. If some packets get lost along the way, you might see glitches or freezes in the video because those missing packets contain important information. This can happen more frequently if the distance between the sender and receiver is greater, or if there's a lot of traffic on the network.
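
UDP itself does not tell a receiver that something went missing. One common way to notice loss, shown here only as an illustration, is for the application to number its messages and look for gaps. The sketch assumes each message carries a hypothetical integer sequence number. It is not a description of how any particular product detects loss.

```python
from __future__ import annotations


def find_missing(received: list[int]) -> list[int]:
    """Return the sequence numbers absent between the lowest and highest seen."""
    if not received:
        return []
    seen = set(received)  # duplicates and out-of-order arrival do not matter
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


if __name__ == "__main__":
    # Messages 3 and 6 never arrived; 5 arrived late, after 7.
    print(find_missing([1, 2, 4, 7, 5]))
```

A gap only shows that a message was lost. Recovering it still requires retransmission or error correction at a higher layer.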

Dealing with Packet Loss:

To deal with packet loss when using UDP, we often employ strategies such as application-layer error correction or forward error correction (FEC). Another common approach is to have a backup system in place. For instance, a collector might store incoming data from a UDP session and then transmit it using a more reliable protocol like TCP.
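
The store-then-transmit idea can be sketched as a small in-memory queue. This is an illustrative sketch, not the collector's actual implementation. The class name, the max_messages limit, and the send callback (assumed to return True once the upstream side has accepted a message) are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardBuffer:
    """Hold messages locally until a reliable sender confirms delivery."""

    def __init__(self, send: Callable[[bytes], bool], max_messages: int = 10_000) -> None:
        self._send = send  # hypothetical callback: True means the upstream accepted it
        # Bounded queue: when full, the oldest message is dropped to make room.
        self._queue: Deque[bytes] = deque(maxlen=max_messages)

    def store(self, message: bytes) -> None:
        """Keep a message that arrived locally, for example over UDP."""
        self._queue.append(message)

    def flush(self) -> int:
        """Forward queued messages in order and stop at the first failure."""
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # keep the message for the next flush attempt
            self._queue.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        """Return how many messages are still waiting to be forwarded."""
        return len(self._queue)
```

A message leaves the queue only after the send callback reports success, so a network outage delays delivery without discarding data, up to the size of the buffer.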

In summary, while UDP is great for fast communication, it comes with the trade-off of potential packet loss. By understanding this and using appropriate strategies, we can minimize its impact on our communication systems.

Source Authentication Explained

Source authentication is a security measure used to verify the identity of the sender or the source of a message or data. Imagine you receive a letter in the mail. You might want to make sure it's from the person or organization claiming to have sent it, rather than from someone pretending to be them.

In the digital world, source authentication serves a similar purpose. When you receive an email, visit a website, or interact with any online service, source authentication helps confirm that the information you're receiving or the sender you're communicating with is indeed who they claim to be.

There are various techniques and technologies used for source authentication, such as digital signatures, certificates, and cryptographic protocols. These methods involve creating unique identifiers or "signatures" that are difficult to forge and can only be generated by the genuine sender. When you receive data or messages with these signatures attached, your system can verify them to ensure they originated from the expected source.
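
As a small illustration of the verification step, the sketch below uses an HMAC from the Python standard library. An HMAC is a message authentication code based on a shared secret key, not a public-key digital signature, and it stands in here only because it needs no third-party packages. The key and message values are made up, and nothing here describes how a specific collector authenticates itself.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag that only a holder of `key` can produce."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature: str) -> bool:
    """Check that `signature` matches `message` under `key`."""
    expected = sign(key, message)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    shared_key = b"example-shared-secret"  # hypothetical key for illustration
    tag = sign(shared_key, b"log line 1")
    print(verify(shared_key, b"log line 1", tag))         # genuine message
    print(verify(shared_key, b"tampered log line", tag))  # altered message
```

The receiver accepts a message only when the tag verifies, so a sender without the key cannot inject data, and a message altered in transit is rejected.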

Source authentication is crucial for maintaining trust and security in digital communications. It helps prevent various forms of cyberattacks, such as impersonation, phishing, and data manipulation, by ensuring that data is accepted only from senders whose identity has been verified. By implementing source authentication measures, users can have greater confidence in the integrity and authenticity of the data they interact with online.