VPC Flow Logs – Introduction and analytics with CloudWatch Logs Insights

VPC Flow Logs is an AWS feature that lets you capture information about the IP traffic going to and from network interfaces in a VPC. These logs can be delivered to AWS CloudWatch Logs for further analysis, and flow logging does not affect network performance in any way.

Let’s briefly review the basic concepts and available options, and configure Flow Logs for a VPC with data delivery to CloudWatch Logs for analysis.

Logs can be enabled for an entire VPC, a subnet, or a specific interface. When enabled for an entire VPC, logging will be enabled for all network interfaces in that network.

Services for which you can use Flow Logs:

  • Elastic Load Balancing
  • Amazon RDS
  • Amazon ElastiCache
  • Amazon Redshift
  • Amazon WorkSpaces
  • NAT gateways
  • Transit gateways

The data is recorded as flow log records, which are text entries with a defined set of fields.

Use cases

What can be tracked with Flow logs?

  • Security Group/Network ACL rule hits – blocked requests will be marked as REJECT
  • what we are enabling the logs for here – to get a picture of the traffic between the VPC and other services, to understand who consumes the most traffic, where and how much cross-AZ traffic there is, etc.
  • monitoring remote logins to the system – watch ports 22 (SSH) and 3389 (RDP), as in the query sketch after this list
  • port scan tracking
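A minimal Logs Insights sketch for the SSH/RDP case (assuming the default record format, where CloudWatch auto-discovers field names such as srcAddr, dstAddr, and dstPort) might look like this:

# count accepted connections to the SSH/RDP ports per source/destination pair
filter (dstPort = 22 or dstPort = 3389) and action = "ACCEPT"
| stats count(*) as connections by srcAddr, dstAddr, dstPort
| sort connections desc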

Flow Log record – fields

Each entry in the log contains data about the IP traffic captured within an aggregation interval, and is a string of space-separated fields, where each field carries information about the transfer, such as the Source IP, Destination IP, and protocol.

By default, the following format is used:

${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status}
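For example, a record in this default format may look like the following (an illustrative entry for an accepted SSH connection: protocol 6 is TCP, destination port 22):

2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK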

See the Available fields table in the documentation – everything in the Version 2 column is included in the default format. Fields from other versions are available in a Custom Format.

When creating Flow Logs, we can use the default format or define our own – we will look at this a bit further below.

Limitations of VPC Flow Logs

  • cannot be used with EC2-Classic instances
  • you cannot create logs for VPC peerings that lead to a VPC in another account
  • after creating a log, you cannot change its configuration or record format
  • if an interface has several IPv4 addresses and traffic is sent to one of the secondary addresses, the dstaddr field will show the primary address; to get the original address, use the pkt-dstaddr field
  • if traffic is sent from or to a network interface, the srcaddr and dstaddr fields will contain its primary private IPv4 address; to get the original addresses, use the pkt-srcaddr and pkt-dstaddr fields

Also, consider that:

  • requests to the Amazon DNS server are not logged, but they are written if you use your own DNS
  • traffic to and from 169.254.169.254 (used by EC2 instances to get instance metadata) is not logged
  • traffic between the EC2 network interface and the AWS Network Load Balancer interface is not logged

See all restrictions in Flow log limitations.

To create a flow log, we need to specify:

  • the resource whose logs will be written – a VPC, a subnet or a specific network interface
  • the type of traffic we log (accepted traffic, rejected traffic or all traffic)
  • and where we will write the data – to an S3 bucket or to CloudWatch Logs

For now, we’ll see what happens with CloudWatch Logs, and next time we’ll try to visualize in Kibana or Grafana.

CloudWatch Logs Log Group

We create a Log Group:
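If you prefer the AWS CLI, the Log Group can also be created like this (the name here is just an example):

aws logs create-log-group --log-group-name bttrm-vpc-flowlogs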

Creation of IAM Policy and IAM Role

In order for the Flow Logs service to be able to write to our CloudWatch Logs, we need to configure its access rights.

Go to AWS IAM, create an IAM Policy and an IAM Role.

Let’s start with Policy:

We add the policy itself:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

Save it:
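The same policy can be created from the CLI, assuming the JSON above is saved to a file (the policy name is arbitrary):

# vpc-flow-logs-policy.json contains the IAM Policy document from above
aws iam create-policy \
  --policy-name vpc-flow-logs-policy \
  --policy-document file://vpc-flow-logs-policy.json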

We create a role.

Go to IAM Roles, create a new one, and select the EC2 type:

Find the policy created above and attach it:

Set a name and save:

Let’s go to the role’s Trust relationships (see AWS: IAM AssumeRole – description, examples) and edit them – change the value of the Service field to vpc-flow-logs.amazonaws.com:

We specify:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "vpc-flow-logs.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Save it:
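Or from the CLI, assuming the trust policy above is saved to a file (the role and policy names and the account ID are placeholders):

# vpc-flow-logs-trust.json contains the Trust relationship document from above
aws iam create-role \
  --role-name vpc-flow-logs-role \
  --assume-role-policy-document file://vpc-flow-logs-trust.json
# attach the policy created earlier
aws iam attach-role-policy \
  --role-name vpc-flow-logs-role \
  --policy-arn arn:aws:iam::111111111111:policy/vpc-flow-logs-policy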

VPC – Enable Flow Logs

And finally, we proceed to enabling the logs – find the necessary VPC and click Flow Logs > Create:

We set the name, Filter, Interval:

Under Destination, select CloudWatch Logs and specify the previously created Log Group and IAM Role:

Format – we leave it Default.
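For reference, the equivalent flow log can be created with the AWS CLI; the VPC ID, account ID, and resource names below are placeholders:

# enable flow logs for the whole VPC, all traffic, 10-minute aggregation interval
aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-0a1b2c3d \
  --traffic-type ALL \
  --max-aggregation-interval 600 \
  --log-destination-type cloud-watch-logs \
  --log-group-name bttrm-vpc-flowlogs \
  --deliver-logs-permission-arn arn:aws:iam::111111111111:role/vpc-flow-logs-role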

We check the Status:

And in a few minutes the data starts coming in:

In the Log Group, the first stream appeared, named after the Elastic Network Interface the data is taken from:

Let’s take a quick look at what’s available to us in Logs Insights.

Let’s click Queries for a query syntax hint:

For example, let’s get the top 15 hosts by the number of packets:
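A sketch of such a query – with the default format, Logs Insights auto-discovers fields like srcAddr, dstAddr, and packets:

# top 15 source/destination pairs by total packets
stats sum(packets) as packetsTransferred by srcAddr, dstAddr
| sort packetsTransferred desc
| limit 15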

Or by the amount of data sent:

stats sum(bytes) as BytesSent by srcAddr, dstAddr
| sort BytesSent desc

Okay, what about other formats?

For example, you want to see the direction of the request (egress/ingress) and the value of the pkt-dstaddr field.

See examples on the Flow log record examples page.

We use the following format:

region vpc-id az-id subnet-id instance-id interface-id flow-direction srcaddr dstaddr srcport dstport pkt-srcaddr pkt-dstaddr pkt-src-aws-service pkt-dst-aws-service traffic-path packets bytes action

In CloudWatch Logs, create a new Log Group and name it bttrm-eks-dev-1-21-vpc-fl-custom. Let’s not forget about retention, so that the data does not sit there forever (and let’s not forget that CloudWatch is not the cheapest service):
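The retention can also be set from the CLI (30 days here is just an example):

aws logs put-retention-policy \
  --log-group-name bttrm-eks-dev-1-21-vpc-fl-custom \
  --retention-in-days 30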

Return to the VPC, find the required network, and create a new Flow Log named bttrm-eks-dev-1-21-vpc-fl-custom:

Select Custom Format and the fields you want to record. Keep in mind that the order of the fields in the logs will be the order in which you select them.

That is, if you click on “region” first, it will go first in the logs:

It turns out like this:

${region} ${vpc-id} ${az-id} ${subnet-id} ${instance-id} ${interface-id} ${flow-direction} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${pkt-srcaddr} ${pkt-dstaddr} ${pkt-src-aws-service} ${pkt-dst-aws-service} ${traffic-path} ${packets} ${bytes} ${action}
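For reference, the same custom-format flow log could be created from the CLI with the --log-format option (the VPC ID, account ID, and role name are placeholders):

aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-0a1b2c3d \
  --traffic-type ALL \
  --log-destination-type cloud-watch-logs \
  --log-group-name bttrm-eks-dev-1-21-vpc-fl-custom \
  --deliver-logs-permission-arn arn:aws:iam::111111111111:role/vpc-flow-logs-role \
  --log-format '${region} ${vpc-id} ${az-id} ${subnet-id} ${instance-id} ${interface-id} ${flow-direction} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${pkt-srcaddr} ${pkt-dstaddr} ${pkt-src-aws-service} ${pkt-dst-aws-service} ${traffic-path} ${packets} ${bytes} ${action}'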

Flow Log Custom format and CloudWatch Logs Insights

But if we now go to Logs Insights and try any of the previous queries, we will not get the fields we wanted:

That is, we can see the data, but how to divide the fields into columns?

We are unlikely to use CloudWatch Logs extensively – in production the data will most likely go to S3 and then to ELK (logz.io) – so I will not dwell on the details here, but let’s see the principle of operation.

By default, CloudWatch Logs creates several meta fields that we can use in queries:

  • @message: “raw” data – the entire message in text
  • @timestamp: event time
  • @logStream: stream name

For a Custom format, fields are created with the parse command: we pass it the @message field with all the content, and then parse it into fields separated by spaces:

parse @message "* * * * * * * * * * * * * * * * * * *"
  as region, vpc_id, az_id, subnet_id, instance_id, interface_id,
  flow_direction, srcaddr, dstaddr, srcport, dstport,
  pkt_srcaddr, pkt_dstaddr, pkt_src_aws_service, pkt_dst_aws_service,
  traffic_path, packets, bytes, action
| sort @timestamp desc

Here the number of “*” in @message must equal the number of field names we specify – ${vpc-id} etc.

Also, field names must not contain dashes. That is, to display the original ${vpc-id} field as a column, specify it as vpc_id (or vpcID – whichever format you prefer).

We check:

Now that’s better!

Apart from parse, we can use commands like filter, display, and stats. See them all in CloudWatch Logs Insights query syntax.

Examples of Logs Insights

Well, let’s try to display something, for example – get all requests blocked by Security Group/Network Access List rules; they will be marked as REJECT.

To our query:

parse @message "* * * * * * * * * * * * * * * * * * *"
  as region, vpc_id, az_id, subnet_id, instance_id, interface_id,
  flow_direction, srcaddr, dstaddr, srcport, dstport,
  pkt_srcaddr, pkt_dstaddr, pkt_src_aws_service, pkt_dst_aws_service,
  traffic_path, packets, bytes, action

Let’s add:

  • filter action="REJECT"
  • stats count(action) as rejects by srcaddr
  • sort rejects desc

Here:

  • filter by the action field of the packet – select all REJECT records
  • count the number of entries in the action field, grouped by the source IP address, and display the result in a rejects column
  • and sort by the rejects column

That is, the complete query will now be:

parse @message "* * * * * * * * * * * * * * * * * * *"
  as region, vpc_id, az_id, subnet_id, instance_id, interface_id,
  flow_direction, srcaddr, dstaddr, srcport, dstport,
  pkt_srcaddr, pkt_dstaddr, pkt_src_aws_service, pkt_dst_aws_service,
  traffic_path, packets, bytes, action
| filter action="REJECT"
| stats count(action) as rejects by srcaddr
| sort rejects desc

We get:

We can also use negative filters and combine filter conditions with the and/or operators.

For example, to remove from the output all IPs that begin with 162.142.125, we add filter (srcaddr not like "162.142.125."):

...
| filter action="REJECT"
| filter (srcaddr not like "162.142.125.")
| stats count(action) as rejects by srcaddr
| sort rejects desc

See Sample queries.

And let’s add a filter to select only incoming requests – flow_direction == ingress:

...
| filter action="REJECT"
| filter (srcaddr not like "162.142.125.") and (flow_direction like "ingress")
| stats count(action) as rejects by flow_direction, srcaddr, dstaddr, pkt_srcaddr, pkt_dstaddr
| sort rejects desc

We get the top rejected requests – where a SecurityGroup or VPC Network Access List rule was triggered.

And let’s see what that dstaddr IP is – where was the blocked packet headed?

Let’s go to EC2 > Network Interfaces and search by the Private IP:

We find the “Elastic IP address owner”:

LoadBalancer.

If the address is not in AWS, it may be an endpoint in Kubernetes – then we can search for it, for example, like this:

kubectl get endpoints --all-namespaces | grep 10.1.55.140

dev-ios-check-translation-ns                     ios-check-translation-backend-svc                    10.1.55.140:3000                                                     58d

dev-ios-check-translation-ns                     ios-check-translation-frontend-svc                   10.1.55.140:80                                                       58d

In general, that’s all.

In the next part, we will configure the collection of logs in AWS S3, then we will collect them from there in ELK, and there we will try to make visualization and alerts.



Also published on Medium.


