Data Capture and Analysis System Based on TCP/IP Networks

Time: 2022-09-29 11:41:50

Abstract. On today's Ethernet networks, almost all hosts rely on the TCP/IP protocol suite to communicate with one another. With packet capture and analysis technology based on the TCP/IP protocol suite, a host can receive every packet that flows through its Ethernet interface, regardless of the destination address in the packet header. This paper introduces the working principle of a capture and analysis system for TCP/IP networks, describes the WinPcap-based tools and functions used to capture network packets, and illustrates methods for analyzing the captured packets.

Key words: TCP/IP, WinPcap, Data Capture, Data Analysis

1. Introduction

TCP/IP (Transmission Control Protocol / Internet Protocol) is the most popular network protocol suite and is not tied to any specific operating system platform. Analyzing the data structures defined by the TCP/IP protocols is the foundation of Internet and web application development. Network data capture and analysis is an important means of gathering operating data and assessing the running state of a network system, and it plays an important role in diagnosing the causes of network anomalies. It has become the most common approach to analyzing data on TCP/IP networks because it can reveal the operating conditions of the network at different protocol levels.

2. Key technologies of the network packet capture and analysis system

2.1 TCP/IP protocol suite

TCP/IP is a suite of network communication protocols composed mainly of IP at the network layer and TCP at the transport layer; it is the basic protocol suite and the foundation of the Internet.[1]

TCP provides a reliable data stream service and achieves reliable transmission with a technique called "positive acknowledgment with retransmission".[2] TCP also uses a method called "sliding window" for flow control; the "window" represents the receiver's capacity to accept data and is used to limit the sender's transmission rate. Figure 1 shows the format of the TCP header.
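The header fields mentioned above can be read directly from the first 20 bytes of a TCP segment. The following is a minimal sketch, assuming the standard fixed TCP header layout; the names tcp_info and parse_tcp are hypothetical and chosen only for illustration, and TCP options are skipped rather than decoded.

```c
#include <stdint.h>
#include <stddef.h>

/* Fixed 20-byte part of a TCP header, converted to host byte order. */
struct tcp_info {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;          /* sequence number */
    uint32_t ack;          /* acknowledgment number */
    unsigned header_len;   /* data offset field, expressed in bytes */
    uint8_t  flags;        /* CWR ECE URG ACK PSH RST SYN FIN */
    uint16_t window;       /* receive window advertised for flow control */
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or malformed. */
int parse_tcp(const uint8_t *buf, size_t len, struct tcp_info *out)
{
    if (len < 20)
        return -1;
    out->src_port   = be16(buf);
    out->dst_port   = be16(buf + 2);
    out->seq        = be32(buf + 4);
    out->ack        = be32(buf + 8);
    out->header_len = (unsigned)(buf[12] >> 4) * 4;
    out->flags      = buf[13];
    out->window     = be16(buf + 14);
    if (out->header_len < 20 || out->header_len > len)
        return -1;
    return 0;
}
```

A caller passes a pointer to the start of the TCP segment and the number of bytes available; the window field returned is the value the receiver uses to limit the sender's rate.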

IP (Internet Protocol) is designed for communication across interconnected computer networks, and it is an unreliable, connectionless protocol. IP defines the basic unit of data transmitted on the Internet and its format. IP software selects the path for data transfer with a routing algorithm, and IP also includes a set of rules for unreliable packet delivery that cover how packets are processed, how errors are reported, and how data is grouped into packets. Figure 2 shows the format of the IP header.
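As an illustration of the IP header format, the sketch below decodes the main fields of an IPv4 header and verifies its header checksum. It assumes IPv4 only; the names ipv4_info, parse_ipv4 and ipv4_checksum are hypothetical, and IP options are skipped rather than decoded.

```c
#include <stdint.h>
#include <stddef.h>

struct ipv4_info {
    unsigned version;      /* 4 for IPv4 */
    unsigned header_len;   /* IHL field, expressed in bytes */
    uint16_t total_len;    /* header plus payload, in bytes */
    uint8_t  ttl;
    uint8_t  protocol;     /* 6 = TCP, 17 = UDP */
    uint32_t src;          /* source address, host byte order */
    uint32_t dst;          /* destination address, host byte order */
};

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* One's-complement checksum over the header. When the stored checksum
   field is included in the sum, a result of 0 means the header is intact. */
uint16_t ipv4_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns 0 on success, -1 if the buffer is too short or not IPv4. */
int parse_ipv4(const uint8_t *buf, size_t len, struct ipv4_info *out)
{
    if (len < 20)
        return -1;
    out->version    = (unsigned)(buf[0] >> 4);
    out->header_len = (unsigned)(buf[0] & 0x0F) * 4;
    if (out->version != 4 || out->header_len < 20 || out->header_len > len)
        return -1;
    out->total_len = (uint16_t)((buf[2] << 8) | buf[3]);
    out->ttl       = buf[8];
    out->protocol  = buf[9];
    out->src       = be32(buf + 12);
    out->dst       = be32(buf + 16);
    return 0;
}
```

The protocol field is what a packet analyzer inspects to decide whether the payload should be handed to a TCP or a UDP decoder.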

2.2 Principle of data capture

In a real TCP/IP network, the NIC (network interface card) sends and receives network data. Each NIC has a globally unique 48-bit hardware address, called the MAC (Media Access Control) address, which uniquely identifies each device on the network.

Currently, Ethernet data transmission is based on the principle of "sharing": all computers on the same local network receive the same data packets. Each NIC integrates a hardware "filter" that ignores frames whose destination does not match the NIC's own MAC address, so the NIC discards all network traffic that is not addressed to it.

Generally, a NIC has four receive modes:[3]

1) Broadcast mode: the NIC receives all broadcast frames on the network.

2) Multicast mode: the NIC receives multicast data.

3) Direct mode: the NIC receives only frames whose destination address matches its own address.

4) Promiscuous mode: the NIC receives all data passing through it, regardless of whether the data is addressed to it.
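The four receive modes can be pictured as a decision made by the NIC's hardware filter for each arriving frame. The sketch below is a simplified software model of that decision, not driver code: the RX_* flags and nic_accepts are hypothetical names, and multicast mode is assumed to accept every multicast group, whereas a real NIC usually filters by subscribed group.

```c
#include <stdint.h>
#include <string.h>

/* Receive modes, modelled as flags that can be combined. */
enum {
    RX_DIRECT      = 1 << 0,
    RX_BROADCAST   = 1 << 1,
    RX_MULTICAST   = 1 << 2,
    RX_PROMISCUOUS = 1 << 3
};

/* Returns 1 if a NIC with address `own` and the enabled `modes` would
   pass a frame addressed to `dst` up to the host, 0 if it is dropped. */
int nic_accepts(const uint8_t own[6], const uint8_t dst[6], unsigned modes)
{
    static const uint8_t bcast[6] = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};

    if (modes & RX_PROMISCUOUS)
        return 1;                              /* accept everything */
    if (memcmp(dst, bcast, 6) == 0)
        return (modes & RX_BROADCAST) != 0;
    if (dst[0] & 0x01)                         /* group bit marks multicast */
        return (modes & RX_MULTICAST) != 0;
    return (modes & RX_DIRECT) && memcmp(dst, own, 6) == 0;
}
```

In this model, a frame addressed to another host is dropped under RX_DIRECT | RX_BROADCAST but accepted once RX_PROMISCUOUS is set, which is the property packet capture relies on.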

Packet capture and analysis technology based on TCP/IP makes use of this feature: when the NIC is set to promiscuous mode, it can receive every packet transmitted on the Ethernet.

Based on the above description, the basic principles of data capture can be summarized as follows. Ethernet carrying TCP/IP traffic transmits data by broadcast, so all data signals reach all hosts. At the same time, a NIC working in promiscuous mode receives all data passing through it, regardless of whether the data is addressed to it. Packet capture and analysis in a TCP/IP Ethernet environment therefore requires two steps. First, set the NIC to promiscuous mode so that it intercepts all packets flowing through it. Second, because promiscuous mode only accomplishes capture, analyze each packet by identifying its header according to predefined rules.
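Identifying the header "according to predefined rules" starts with the Ethernet header at the front of every captured frame. The following is a minimal sketch, assuming untagged Ethernet II framing (no VLAN tag); eth_info and parse_ethernet are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define ETHERTYPE_IPV4 0x0800
#define ETHERTYPE_ARP  0x0806

/* The 14-byte Ethernet II header. */
struct eth_info {
    uint8_t  dst[6];       /* destination MAC address */
    uint8_t  src[6];       /* source MAC address */
    uint16_t ethertype;    /* identifies the network-layer protocol */
};

/* Returns 0 on success, -1 if the frame is shorter than the header. */
int parse_ethernet(const uint8_t *frame, size_t len, struct eth_info *out)
{
    if (len < 14)
        return -1;
    memcpy(out->dst, frame, 6);
    memcpy(out->src, frame + 6, 6);
    /* The EtherType is sent in network byte order (high byte first). */
    out->ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```

An analyzer would compare ethertype with ETHERTYPE_IPV4 before treating the bytes from offset 14 onward as an IP header.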

2.3 WinPcap

WinPcap (Windows Packet Capture) is a free, public network access system for the Windows platform that gives Win32 applications access to the underlying network.[4] WinPcap consists mainly of a kernel-level packet filter, a low-level dynamic link library, and a high-level system library, which together provide an application programming interface for accessing packets directly.

WinPcap provides programmers with a standard set of network packet capture interfaces. Because it is compatible with libpcap, many network security programs written for Linux can be ported quickly to Windows. WinPcap captures and filters packets in the kernel layer, with careful attention to performance and efficiency optimizations.

NPF (Netgroup Packet Filter) is the core component of WinPcap and does the heavy lifting. NPF handles the packets transmitted over the network and provides capture, injection, and analysis capabilities to user-level applications.

WinPcap provides the following functions:[3]

1) Capture raw packets, including packets sent and received on the shared network and packets exchanged between hosts.

2) Filter out certain packets according to user-defined rules before they are delivered to the application.

3) Send raw packets onto the network, and collect statistics about network traffic.

3. Process of data capture

WinPcap captures network data in a TCP/IP network environment in two ways. One is callback mode: the callback function is executed when the read timeout expires or the buffer is full, and the collected raw packets are then delivered to the user; with this method, the data buffer contains multiple packets.

The basic packet capture process with WinPcap is as follows. First, obtain the list of NIC devices and select the NIC to listen on; the chosen NIC must be set to promiscuous mode, and the filter parameters must also be set. Next, the packets collected from the NIC are copied to the kernel buffer. Finally, on a call from the upper layer, the packet data in the kernel buffer is copied to the user buffer, where the application processes it and extracts information. A data frame captured by the WinPcap driver is an Ethernet frame, encapsulated in turn by the transport layer, the network layer, and the data link layer, so the frame can be parsed further to obtain useful information.

When packets travel from the network to the capture application, the NIC driver intercepts them; the packet filter calls a library function of the NDIS interface, which copies the packets to the previously allocated kernel buffer; the application then copies the data from the kernel buffer to the user buffer through the interface functions. These steps together implement network data capture. Moreover, a single read request can transfer multiple packets from the kernel buffer to the user buffer. When packets travel from the application to the network, the process is reversed. The difference is that a single write request can transfer only one packet from the user buffer to the kernel buffer.
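The idea that one read returns a buffer holding several packets, which are then handed to a callback one at a time, can be modelled as follows. This is a simplified illustration only: the record layout (a 2-byte big-endian length followed by the packet bytes) is an assumption made for this sketch and is not WinPcap's actual buffer format, and dispatch_batch, packet_cb and count_bytes are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* Callback type: invoked once for every packet found in the buffer. */
typedef void (*packet_cb)(void *user, const uint8_t *data, size_t len);

/* Walks a buffer of records and calls `cb` once per complete record.
   Stops at the first truncated record and returns the number of
   packets delivered. */
size_t dispatch_batch(const uint8_t *buf, size_t buf_len,
                      packet_cb cb, void *user)
{
    size_t off = 0;
    size_t delivered = 0;

    while (buf_len - off >= 2) {
        size_t plen = ((size_t)buf[off] << 8) | buf[off + 1];
        if (plen > buf_len - off - 2)
            break;                         /* incomplete record */
        cb(user, buf + off + 2, plen);
        off += 2 + plen;
        delivered++;
    }
    return delivered;
}

/* Example callback: accumulates the total number of packet bytes seen. */
void count_bytes(void *user, const uint8_t *data, size_t len)
{
    (void)data;
    *(size_t *)user += len;
}
```

Batching amortizes the cost of the kernel-to-user copy over many packets, which is why a single read request is allowed to carry more than one packet.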

The specific steps and related functions for capturing network packets with WinPcap are as follows:

1) Open the specified NIC: WinPcap provides the function pcap_findalldevs() to obtain the network interfaces configured on the current machine. Information about the interfaces is stored in a linked list of pcap_if structures, and each element of the list contains comprehensive information about one NIC. The function pcap_open_live(const char *device, int snaplen, int promisc, int to_ms, char *ebuf) opens a NIC device. The promisc argument must be set to 1 to put the NIC into promiscuous mode. If the call succeeds, it returns a handle to the specified NIC.[4]

2) Set filtering rules: Users can set appropriate filters as needed, for example to receive only UDP or only TCP packets. The key to setting the filtering rules is to call two functions correctly: pcap_compile(pcap_t *p, struct bpf_program *fp, char *str, int optimize, bpf_u_int32 netmask) and pcap_setfilter(pcap_t *p, struct bpf_program *fp).

3) Capture and parse the network packets: WinPcap provides several functions for different situations; some rely on a callback function and some support non-blocking mode, so users can choose according to their needs. For example, pcap_loop(pcap_t *p, int cnt, pcap_handler callback, u_char *user) supports callback mode and can be used to capture packets for analysis.

4. Analysis methods for the captured data

Parsing the captured data protocol by protocol yields useful information. Analysis proceeds bottom-up through the layer hierarchy (link layer, network layer, transport layer, application layer), and the results are then displayed and output. During parsing, the various protocol header fields must be converted from network byte order to host byte order. Different types of CPU store multi-byte data (such as short, int, and long) in different byte orders: when the low-order byte is stored at the starting address, the order is called "little-endian"; when the high-order byte is stored there, it is called "big-endian". Because a network connects different types of computers, a unified representation of numbers during transmission is very important, and network byte order is big-endian. For this reason, multi-byte header fields that represent a length or a type must be converted to the local host's representation. The functions ntohs() and ntohl() convert a two-byte or four-byte integer, respectively, from network byte order to host byte order.
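The conversion described above can be written portably without relying on any platform header. The sketch below shows what ntohs() and ntohl() accomplish; the names host_is_little_endian, net_to_host16 and net_to_host32 are hypothetical stand-ins, and production code would normally call the system-provided ntohs() and ntohl() instead.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the low-order byte is stored at the lowest address
   (little-endian host), 0 otherwise (big-endian host). */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* `net` holds two bytes exactly as they arrived from the network
   (high-order byte first). Returns the value in host byte order. */
uint16_t net_to_host16(uint16_t net)
{
    uint8_t b[2];
    memcpy(b, &net, 2);
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Four-byte counterpart of net_to_host16(). */
uint32_t net_to_host32(uint32_t net)
{
    uint8_t b[4];
    memcpy(b, &net, 4);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

Because the functions rebuild the value from individual bytes, they give the same result on little-endian and big-endian hosts; on a big-endian host the conversion simply returns its input unchanged.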

In the network architecture each layer is logically independent, and the functions of the layers are clearly demarcated. Adjacent layers communicate through standard interfaces, and each interface defines the service operations that the lower layer provides to the upper layer. When an application sends data over the network, the data is handed to the TCP/IP protocol stack and passes through each layer from top to bottom until it enters the network as a bit stream. As the data passes through each layer, information is added as a header or trailer; this process is called encapsulation. The bit streams transmitted over Ethernet are called frames. At the other end, when the destination host receives an Ethernet frame, the TCP/IP protocol stack parses the data from the lowest layer to the highest, removing the header added by each layer's protocol. Each layer's protocol checks the identifier field in its header to determine which upper-layer protocol should receive the data; eventually, the application-layer data extracted from the packet is delivered to the application process.
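The layer-by-layer parsing of a received frame can be sketched as a single function that checks each identifier field before moving up one layer. The example assumes an untagged Ethernet II frame carrying IPv4 and TCP; dissect and dissect_frame are hypothetical names, and checksums are not verified here.

```c
#include <stdint.h>
#include <stddef.h>

struct dissect {
    uint16_t ethertype;    /* link-layer identifier: 0x0800 = IPv4 */
    uint8_t  ip_proto;     /* network-layer identifier: 6 = TCP */
    uint16_t src_port;
    uint16_t dst_port;
    size_t   payload_off;  /* offset of application data in the frame */
    size_t   payload_len;  /* length of application data in bytes */
};

static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 if the frame is well-formed Ethernet II / IPv4 / TCP,
   -1 otherwise. Each header is stripped from the bottom layer upward. */
int dissect_frame(const uint8_t *f, size_t len, struct dissect *d)
{
    /* Link layer: 14-byte Ethernet header, EtherType selects the next layer. */
    if (len < 14)
        return -1;
    d->ethertype = be16(f + 12);
    if (d->ethertype != 0x0800)
        return -1;

    /* Network layer: IPv4 header, protocol field selects the next layer. */
    const uint8_t *ip = f + 14;
    size_t ip_avail = len - 14;
    if (ip_avail < 20 || (ip[0] >> 4) != 4)
        return -1;
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;
    size_t total = be16(ip + 2);
    if (ihl < 20 || total < ihl || total > ip_avail)
        return -1;
    d->ip_proto = ip[9];
    if (d->ip_proto != 6)
        return -1;

    /* Transport layer: TCP header, ports identify the application. */
    const uint8_t *tcp = ip + ihl;
    size_t tcp_avail = total - ihl;
    if (tcp_avail < 20)
        return -1;
    size_t thl = (size_t)(tcp[12] >> 4) * 4;
    if (thl < 20 || thl > tcp_avail)
        return -1;
    d->src_port = be16(tcp);
    d->dst_port = be16(tcp + 2);

    /* Application layer: whatever remains after the TCP header. */
    d->payload_off = 14 + ihl + thl;
    d->payload_len = tcp_avail - thl;
    return 0;
}
```

Using the IP total length rather than the captured length to bound the payload keeps any link-layer padding out of the application data.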

5. Conclusion

Currently, WinPcap is widely used as an efficient network packet capture tool in a variety of development tools for TCP/IP networks. WinPcap is a powerful set of interfaces built on NDIS that gives users a new way to operate on low-level network packets. Capturing and analyzing data packets is the basis of network tool development, and packet capture and analysis applications are pervasive. As a key technology in network tool development, packet capture and analysis has become one of the focal points of web application development.

6. Acknowledgement

This research was supported by the 2013 Innovation and Entrepreneurship Training Program for College Students of Shenyang Institute of Engineering (Grant No. 201311632060).

References

[1] YANG Xiaojin, ZHANG Yu, XU Jiren, LU Jiuming, Analysis of Protocols TCP/IP and OSI/RM, Computer Engineering, vol. 27, No. 10, pp. 263-267.

[2] ZHANG Daxing, The Design of Network Datagram Analysis System, Computer Engineering, vol. 27, No. 10, pp. 192-194, 2001.

[3] SHEN Hui, ZHANG Long, Network Data Monitoring and Analysis Based on WinPcap, Computer Science, vol. 29, No. 7 Supp, pp. 15-18, 2012.

[4] ZHANG Wei, WANG Tao, et al., Data packet capture and applications based on WinPcap, Computer Engineering and Design, vol. 29, No. 7, pp. 1649-1651, 2008.
