Preprocessing of TCP data prior to training of Neural Network

Hi everyone,
I am about to use the neural network fitting tool to train on a TCP data set. I have exported the capture in CSV format from Wireshark. How should I present this data as input?
"203","0.601657","192.95.27.190","71.126.222.64","ICMP","60","Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)"
"204","0.603175","40.75.89.172","71.126.222.64","ICMP","60","Echo (ping) request id=0x4ba4, seq=1280/5, ttl=205 (no response found!)"
"205","0.605941","192.95.27.190","71.126.222.64","TCP","48","50738 > 35451 [SYN] Seq=0 Win=64512 Len=0 MSS=1452 SACK_PERM=1"
It is actually a DDoS data set, and I want to split it into training and test sets for the neural network to learn from. The data is an array in the format below. I think the TCP packets are the more useful ones. I just need direction on how to input the data, and whether I have to restructure it in Excel or some other way first.
"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000000","202.1.175.252","71.126.222.64","ICMP","60","Echo (ping) request id=0xce1d, seq=1280/5, ttl=199 (no response found!)"
"2","0.000954","192.120.148.227","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=208 (no response found!)"
"3","0.004958","51.173.229.255","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=207 (no response found!)"
"4","0.007834","40.75.89.172","71.126.222.64","ICMP","60","Echo (ping) request id=0x4ba4, seq=1280/5, ttl=205 (no response found!)"
"5","0.014597","202.1.175.252","71.126.222.64","ICMP","60","Echo (ping) request id=0xce1d, seq=1280/5, ttl=199 (no response found!)"
"6","0.014933","192.95.27.190","71.126.222.64","ICMP","60","Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)"
"7","0.015303","192.120.148.227","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=208 (no response found!)"
"8","0.015988","51.173.229.255","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=207 (no response found!)"
"9","0.023646","192.95.27.190","71.126.222.64","ICMP","60","Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)"
"10","0.025495","40.75.89.172","71.126.222.64","ICMP","60","Echo (ping) request id=0x4ba4, seq=1280/5, ttl=205 (no response found!)"
"11","0.029376","51.173.229.255","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=207 (no response found!)"
"12","0.032543","202.1.175.252","71.126.222.64","ICMP","60","Echo (ping) request id=0xce1d, seq=1280/5, ttl=199 (no response found!)"
"13","0.033090","192.120.148.227","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=208 (no response found!)"
"14","0.034366","192.95.27.190","71.126.222.64","ICMP","60","Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)"
"15","0.046561","51.173.229.255","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=207 (no response found!)"
"16","0.047110","202.1.175.252","71.126.222.64","ICMP","60","Echo (ping) request id=0xce1d, seq=1280/5, ttl=199 (no response found!)"
"17","0.048551","192.120.148.227","71.126.222.64","ICMP","60","Echo (ping) request id=0x0200, seq=1280/5, ttl=208 (no response found!)"
"18","0.054331","40.75.89.172","71.126.222.64","ICMP","60","Echo (ping) request id=0x4ba4, seq=1280/5, ttl=205 (no response found!)"
"19","0.055427","192.95.27.190","71.126.222.64","ICMP","60","Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)"
"20","0.055514","51.81.166.201","71.126.222.64","ICMP","60","Echo (ping) request id=0xef41, seq=1280/5, ttl=202 (no response found!)"
  2 Comments
Walter Roberson on 6 Jun 2015
What information do you want to extract from it? What are you planning to "fit" ?
Note that the amount of data depends upon whether the packet is ICMP, TCP, or UDP.
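To illustrate why the protocol matters, here is a minimal sketch that pulls different fields out of the Info column for an ICMP row and a TCP row. The two strings are copied from the sample rows in the question. The choice of fields and the variable names (`icmpId`, `srcPort`, and so on) are only assumptions for illustration, not a recommendation of which features to use.

```matlab
% Two Info strings copied from the sample rows in the question.
icmpInfo = 'Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)';
tcpInfo  = '50738 > 35451 [SYN] Seq=0 Win=64512 Len=0 MSS=1452 SACK_PERM=1';

% ICMP: the Info text carries an identifier (hex) and a TTL.
tok = regexp(icmpInfo, 'id=0x([0-9a-fA-F]+).*ttl=(\d+)', 'tokens', 'once');
icmpId  = hex2dec(tok{1});      % identifier converted from hex to decimal
icmpTtl = str2double(tok{2});   % time-to-live

% TCP: the Info text carries ports, flags and window size instead.
tok = regexp(tcpInfo, '^(\d+)\s*>\s*(\d+)\s*\[([A-Z, ]+)\].*Win=(\d+)', ...
             'tokens', 'once');
srcPort  = str2double(tok{1});  % source port
dstPort  = str2double(tok{2});  % destination port
tcpFlags = tok{3};              % flag text, e.g. SYN
tcpWin   = str2double(tok{4});  % advertised window size
```

Since the available fields differ per protocol, you would probably either train on one protocol at a time or fill the missing fields with a placeholder value.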
Walter Roberson on 6 Jun 2015
Okay, take the line
"4","0.007834","40.75.89.172","71.126.222.64","ICMP","60","Echo (ping) request id=0x4ba4, seq=1280/5, ttl=205 (no response found!)"
for example. What do you want the extracted information to be?

Answers (1)

Image Analyst on 6 Jun 2015
Because some columns contain text rather than only numbers, you can't use csvread(). You'll have to use readtable() (probably the easiest way) or textscan(). Attach a few lines of a sample CSV file if you want people to try some code on it.
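As a rough sketch of the readtable() route, the code below writes three of the rows from the question to a temporary file, reads every column as quoted text, and converts the numeric columns afterwards. The temporary file, the shortened variable names, the three-column feature matrix `X` and the 70/30 split are all assumptions made for illustration. The export also contains no attack/normal label, so the targets for the network would have to come from somewhere else.

```matlab
% Write a header plus three sample rows from the question to a temporary CSV.
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '%s\n', ...
    '"No.","Time","Source","Destination","Protocol","Length","Info"', ...
    '"203","0.601657","192.95.27.190","71.126.222.64","ICMP","60","Echo (ping) request id=0xc3d2, seq=1280/5, ttl=207 (no response found!)"', ...
    '"204","0.603175","40.75.89.172","71.126.222.64","ICMP","60","Echo (ping) request id=0x4ba4, seq=1280/5, ttl=205 (no response found!)"', ...
    '"205","0.605941","192.95.27.190","71.126.222.64","TCP","48","50738 > 35451 [SYN] Seq=0 Win=64512 Len=0 MSS=1452 SACK_PERM=1"');
fclose(fid);

% Read all seven columns as double-quoted text (%q), skipping the header line.
% %q keeps the commas inside the quoted Info field from splitting the column.
T = readtable(csvFile, 'Delimiter', ',', 'Format', '%q%q%q%q%q%q%q', ...
              'HeaderLines', 1, 'ReadVariableNames', false);
T.Properties.VariableNames = ...
    {'No', 'Time', 'Source', 'Destination', 'Protocol', 'Length', 'Info'};
delete(csvFile);

% Convert the columns that are really numbers.
T.Time   = str2double(T.Time);
T.Length = str2double(T.Length);

% Example numeric inputs: time, packet length, and a 0/1 flag for TCP rows.
isTcp = strcmp(T.Protocol, 'TCP');
X = [T.Time, T.Length, double(isTcp)];

% Simple random hold-out split (assumed 70% train, 30% test).
n = height(T);
idx = randperm(n);
nTrain = round(0.7 * n);
trainRows = idx(1:nTrain);
testRows  = idx(nTrain+1:end);
XTrain = X(trainRows, :);
XTest  = X(testRows, :);
```

For your own file, skip the part that writes the temporary CSV and pass the path of the Wireshark export to readtable() instead. No restructuring in Excel should be needed for this step.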
