Establishing the Collection Structure

This section focuses on getting data into the Fluency ecosystem.

Moving data from its source into Fluency requires a Pipe. Pipes provide two critical features: Data Processing and Flow Control. The Platform Configuration is essentially pre-processing performed before the main data processing, and the Platform Configuration pipe concludes with routing the record.

To implement a stream end to end, we follow these four steps:

  • Add a Data Source
  • Create a Processor
  • Create a Route and add the Processor
  • Create and Attach a Sink

The subpages will cover these actions in detail. The end result is the combination Data Source -> Router -> Data Sink. Inside the router, additional processing such as transformation occurs.
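The Data Source -> Router -> Data Sink flow can be pictured as a chain of generators. This is only an illustrative sketch; the function names and record fields are hypothetical and not part of the Fluency API.

```python
# Hypothetical sketch of the Data Source -> Router -> Data Sink chain.

def source():
    """A data source yields raw records into the pipeline."""
    yield {"raw": "login failed", "host": "web-01"}
    yield {"raw": "login ok", "host": "web-02"}

def router(records):
    """The router transforms each record before it reaches a sink."""
    for rec in records:
        # Example transformation: derive a severity field.
        rec["severity"] = "high" if "failed" in rec["raw"] else "info"
        yield rec

def sink(records):
    """A sink consumes the processed stream (here, into a list)."""
    return list(records)

events = sink(router(source()))
```

Because each stage consumes the previous stage's output lazily, records stream through one at a time rather than being buffered in full.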

Connecting

Data enters the system from a Data Source. We define how to connect to a data source by its protocol or type. The platform supports the following data sources:

  1. API Plugin
  2. HTTPS Event Collector (HEC)
  3. Kinesis Stream
  4. AWS S3
  5. AWS S3 with SQS
  6. Webhook
  7. Cloud Syslog
  8. Management Queue

Notice that this is not a list of products, such as CrowdStrike Falcon or SentinelOne. A data source connection is simply the means of moving data out of a product, and products often document more than one connection method.
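As an example of one of these connection types, a sender can push an event to an HEC-style endpoint with a plain HTTP POST. The URL and token below are placeholders, and the request shape assumes the common Splunk-style HEC conventions (JSON body, `Authorization: Splunk <token>` header); consult your collector configuration for the actual values.

```python
import json
import urllib.request

# Placeholder endpoint and token -- substitute values from your collector config.
HEC_URL = "https://collector.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

event = {"event": {"message": "user login", "host": "web-01"}}

req = urllib.request.Request(
    HEC_URL,
    data=json.dumps(event).encode("utf-8"),
    headers={
        "Authorization": f"Splunk {HEC_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the event; omitted here.
```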

Parsing

Once the data source is connected, a second pipe is attached to it to Parse the data. Parsing makes the data searchable and processable. During parsing, data can be checked for errors, types can be converted, look-ups performed, and values normalized.

Two common parsing adjustments are:

  1. Adjusting the time to correct for timezone or abnormal clock times.
  2. Adding a User-Entity key for later correlation and scoring.
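Both adjustments can be sketched in a single parsing step. This is a minimal illustration, not Fluency's parser: the field names (`@timestamp`, `user_entity`) and the user@host key format are assumptions chosen for the example.

```python
from datetime import datetime, timezone

def parse_record(rec):
    """Hypothetical parsing step: normalize the timestamp to UTC and
    add a user-entity key for later correlation and scoring."""
    # 1. Correct for timezone: convert the source's local time to UTC.
    local = datetime.fromisoformat(rec["timestamp"])
    rec["@timestamp"] = local.astimezone(timezone.utc).isoformat()
    # 2. User-entity key: a stable identifier combining user and host.
    rec["user_entity"] = f'{rec["user"]}@{rec["host"]}'
    return rec

record = parse_record({
    "timestamp": "2024-05-01T09:30:00-05:00",
    "user": "alice",
    "host": "web-01",
})
```

Normalizing time at parse time means every later stage (search, correlation, alerting) can compare timestamps directly, without per-source offset logic.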

During the parsing phase, some Data Processing can occur. The most common example is the creation of metric data.

Data Routing

The last step in pre-processing is routing. Most often, a record is sent to the Event Watch for further analysis and alerting. However, records that were only used to create metrics, or that are kept only for investigations, can be discarded or sent directly to storage, depending on the situation.
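That routing decision can be pictured as a small dispatch function. The flag names (`metric_only`, `investigation_only`) and destination labels are hypothetical, chosen only to mirror the three outcomes described above.

```python
def route(rec):
    """Hypothetical routing decision: metric-only records are discarded,
    investigation-only records go straight to storage, and everything
    else continues to the Event Watch for analysis and alerting."""
    if rec.get("metric_only"):
        return "discard"
    if rec.get("investigation_only"):
        return "storage"
    return "event_watch"
```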