Sockets and Network Programming in C

In this hyper-connected electronic world, knowing how to send and receive data remotely with sockets is crucial. In this article, we will see how a socket is essentially a digital “plug” that we can attach to a local or remote address in order to establish a connection. We will also explore the architecture and system calls that allow us to create not only a client but also a server in the C programming language.

What are Sockets?

Everyone has probably heard the saying that in Unix systems, “everything is a file”. Sockets are no exception. Indeed, a socket is simply a file descriptor that enables remote communication. There are several different types of sockets, but in this article, we will concentrate on Internet sockets.

Of course, there are also many types of Internet sockets, which all transmit data in different ways. Among them, the two main types are:

- Stream sockets (SOCK_STREAM), which use the connection-oriented TCP protocol.
- Datagram sockets (SOCK_DGRAM), which use the connection-less UDP protocol.

We’ve had the chance to explore both of these protocols, TCP and UDP, in the article about the network layers of the Internet. In this article, we will mainly focus on stream sockets, and see how we can use them for remote communication.

The Importance of Byte Order

Whenever we wish to send and receive data from one computer to another, we must keep in mind that systems can represent their data in two distinct and opposite ways. Take for example the hexadecimal integer 2F0A (which is 12042 in decimal). Because of its size, this integer must be stored over two bytes: 2F and 0A .

It is logical to assume that this integer will always be stored in this order: 2F, followed by 0A. This ordering is known as “big endian”, since the big end of the number, the most significant byte, is stored first. But this is not always the case…

Some systems, particularly those with an Intel or Intel-compatible processor, prefer storing the bytes of our integer in the opposite order, with the least significant byte, or small end, first: 0A followed by 2F. We call this ordering “little endian”.

                   First Byte   Second Byte
Big Endian
 - hexadecimal     2F           0A
 - binary          00101111     00001010
Little Endian
 - hexadecimal     0A           2F
 - binary          00001010     00101111

This potentially incompatible difference between host systems can of course cause some issues during data transfer.

The Network Byte Order is always big endian. But the Host Byte Order can be either big endian, or little endian, depending on its architecture.
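To see which ordering a given machine uses, we can store the example value 2F0A and look at the first byte in memory. This is only a minimal sketch; the function name host_is_little_endian() is made up for this illustration and is not part of any library.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper: returns 1 if this host stores the least significant
// byte first (little endian), 0 if it stores the most significant byte first.
int host_is_little_endian(void)
{
    uint16_t value = 0x2F0A;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value); // copy the raw in-memory bytes
    return (bytes[0] == 0x0A);           // small end first?
}
```

On a typical Intel-compatible machine we would expect this function to return 1.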

Converting to and from Network Byte Order

Thankfully, we can simply assume that the host system is not storing its bytes in the right order compared to the network. All we then need to do is systematically reorder these bytes when we transfer them between the network and host systems. For this, we can make use of four handy functions from the <arpa/inet.h> library:

uint32_t htonl(uint32_t hostlong);  // "Host to network long"
uint16_t htons(uint16_t hostshort); // "Host to network short"
uint32_t ntohl(uint32_t netlong);   // "Network to host long"
uint16_t ntohs(uint16_t netshort);  // "Network to host short"

As we can see, these functions come in two variants: the ones that convert a short (two bytes, or 16 bits), and those that convert a long (four bytes, or 32 bits). Both variants take and return unsigned integers. On a host that is already big endian, they simply return their argument unchanged, so it is always safe to call them.

In order to convert a four byte (32 bit) integer from the Host Byte Order to the Network Byte Order, we’ll want to call the htonl() function (“htonl” stands for “Host to Network Long”). For the opposite operation, we’d use ntohl() (“Network to Host Long”).
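As a small illustration of these conversions, the sketch below writes a 16-bit value into a two-byte buffer in Network Byte Order and reads it back. The helper names port_to_wire() and port_from_wire() are hypothetical and chosen only for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: stores `port` in `out` in Network Byte Order.
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port); // host -> network (big endian)

    memcpy(out, &net, sizeof net);
}

// Hypothetical helper: rebuilds the host value from two network-order bytes.
uint16_t port_from_wire(const unsigned char in[2])
{
    uint16_t net;

    memcpy(&net, in, sizeof net);
    return ntohs(net); // network -> host
}
```

Whatever the host's own byte order, port_to_wire(0x2F0A, buf) leaves 2F in buf[0] and 0A in buf[1], which is exactly the big endian layout from the table above.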

With this word of warning in mind, we can now turn to the issue of establishing a connection within our program.

Preparing a Connection

Whether our program is a server or a client, the first thing we need to do is prepare a small data structure. This structure will contain the information that our socket will need: notably, the IP address and the port to connect to.

Structures for the Connection IP Address and Port

The basic structures we need to use in order to hold the IP address and port we want to connect to can be found in the <netinet/in.h> library. There are two variants of them: one for IPv4 and one for IPv6.

For an IPv4 Address

For an IPv4 address, we will use the sockaddr_in structure, which is defined as follows:

// IPv4 only (see sockaddr_in6 for IPv6)
struct sockaddr_in {
    sa_family_t    sin_family;
    in_port_t      sin_port;
    struct in_addr sin_addr;
};

struct in_addr {
    uint32_t s_addr;
};

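Let’s take a closer look at what we need to supply to this structure:

- sin_family: the address family, which is always AF_INET for an IPv4 address.
- sin_port: the port we want to connect to, in Network Byte Order. We must therefore convert it with htons().
- sin_addr: an in_addr structure that holds the IPv4 address itself.

There is only one field to fill out in the in_addr structure: s_addr. It is a Network Byte Order integer that represents an IPv4 address. We will examine below how to convert an IP address into an integer. However, there are a few constants that we could use here (without forgetting to convert the order of their bytes with htonl()!):

- INADDR_LOOPBACK: the address of the local machine, 127.0.0.1 (localhost).
- INADDR_ANY: the address 0.0.0.0, which stands for any address of the local machine.
- INADDR_BROADCAST: the broadcast address, 255.255.255.255.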
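Filling out the structure for the local machine, using the INADDR_LOOPBACK constant, might look like the sketch below. The function name make_ipv4_loopback() is an assumption made for this example only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

// Hypothetical helper: builds an IPv4 socket address for 127.0.0.1:port.
struct sockaddr_in make_ipv4_loopback(uint16_t port)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof sa);                   // clear every field first
    sa.sin_family = AF_INET;                     // IPv4
    sa.sin_port = htons(port);                   // port in Network Byte Order
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // 127.0.0.1, also converted
    return sa;
}
```

Note that both the port and the address go through a conversion function: htons() for the 16-bit port and htonl() for the 32-bit address.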

For an IPv6 Address

A similar structure exists to specify an IPv6 address:

// IPv6 only (see sockaddr_in for IPv4)
struct sockaddr_in6 {
    sa_family_t     sin6_family;
    in_port_t       sin6_port;
    uint32_t        sin6_flowinfo;
    struct in6_addr sin6_addr;
    uint32_t        sin6_scope_id;
};

struct in6_addr {
    unsigned char s6_addr[16];
};

This structure, sockaddr_in6 , expects the same information as the previous IPv4 structure. We won’t linger on the two new fields, sin6_flowinfo and sin6_scope_id , since this is an introduction to sockets.

Just like for IPv4, there are global variables that we can give as the IPv6 address in the in6_addr structure: in6addr_loopback and in6addr_any .
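The IPv6 equivalent is very similar. In this sketch, the function name make_ipv6_loopback() is hypothetical; in6addr_loopback is already stored in network order, so it can be copied as is.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

// Hypothetical helper: builds an IPv6 socket address for [::1]:port.
struct sockaddr_in6 make_ipv6_loopback(uint16_t port)
{
    struct sockaddr_in6 sa;

    memset(&sa, 0, sizeof sa);       // also zeroes sin6_flowinfo and sin6_scope_id
    sa.sin6_family = AF_INET6;       // IPv6
    sa.sin6_port = htons(port);      // port in Network Byte Order
    sa.sin6_addr = in6addr_loopback; // ::1, no byte order conversion needed
    return sa;
}
```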

Converting an IP Address to an Integer

An IPv4 address such as 216.58.192.3 (or an IPv6 address like 2001:db8:0:85a3::ac1f:8001) is not an integer. It’s a string of characters. In order to convert this string to an integer that we can use in one of the previous structures, we need to call a function from the <arpa/inet.h> library: inet_pton() (“pton” stands for “presentation to network”).

int inet_pton(int af, const char * src, void *dst); 

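Let’s take a closer look at its parameters:

- af: the address family, AF_INET for IPv4 or AF_INET6 for IPv6.
- src: the string containing the IP address to convert.
- dst: a pointer to the in_addr (IPv4) or in6_addr (IPv6) structure in which to store the result.

The inet_pton() function returns:

- 1 on success,
- 0 if src does not contain a valid address for the given family,
- -1 if af is not a valid address family, in which case errno is set to EAFNOSUPPORT.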

To convert an IPv4 address, we can do something like this:

// IPv4 only
struct sockaddr_in sa;

inet_pton(AF_INET, "216.58.192.3", &(sa.sin_addr));

For an IPv6 address:

// IPv6 only
struct sockaddr_in6 sa;

inet_pton(AF_INET6, "2001:db8:0:85a3::ac1f:8001", &(sa.sin6_addr));

Of course, the opposite function exists as well, inet_ntop() (“ntop” meaning “network to presentation”). It allows us to convert an integer into a legible IP address.

But what if we don’t actually know the IP address we want to connect with? Maybe we only have a domain name such as www.example.com…

Automatically Fill In the IP Address with getaddrinfo()

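If we don’t know the precise IP address we wish to connect to, the getaddrinfo() function from the <netdb.h> library will be able to help us. Among other things, it allows us to supply a domain name (www.example.com) instead of an IP address. Calling getaddrinfo() will have a slight performance cost since it usually needs to check the DNS to fill out the IP address for us. Its prototype is as follows:

int getaddrinfo(const char *node, const char *service,
                const struct addrinfo *hints,
                struct addrinfo **res);

This function is not as complicated as it may seem. Its parameters are:

- node: a string containing the host to look up, either a domain name or an IP address. It can be NULL to designate the local machine.
- service: a string containing either a port number ("80") or a service name ("http").
- hints: a pointer to an addrinfo structure that filters the kind of results we want (address family, socket type, and so on).
- res: a pointer to a pointer in which getaddrinfo() stores a linked list of addrinfo structures, one per matching address.

The getaddrinfo() function returns 0 on success, or an error code on failure. The gai_strerror() function can translate the error returned by getaddrinfo() into a readable string.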

In addition, the freeaddrinfo() function allows us to free the memory of the addrinfo once we’re done with it. Let’s examine that structure now.

The addrinfo Structure

Two of getaddrinfo() ’s parameters, hints and res, are pointers towards the same type of structure. So let’s try to understand it:

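struct addrinfo {
    int              ai_flags;
    int              ai_family;
    int              ai_socktype;
    int              ai_protocol;
    size_t           ai_addrlen;
    struct sockaddr *ai_addr;
    char            *ai_canonname;
    struct addrinfo *ai_next;
};

The addrinfo structure contains the following elements:

- ai_flags: options such as AI_PASSIVE, which asks for an address suitable for bind(), or AI_CANONNAME.
- ai_family: the address family, AF_INET for IPv4, AF_INET6 for IPv6, or AF_UNSPEC for either.
- ai_socktype: the type of socket, SOCK_STREAM (TCP) or SOCK_DGRAM (UDP).
- ai_protocol: the protocol to use, or 0 for any protocol that matches the socket type.
- ai_addrlen: the size in bytes of the address pointed to by ai_addr.
- ai_addr: a pointer to a sockaddr_in or sockaddr_in6 structure containing the address and port.
- ai_canonname: the canonical name of the host, only filled in when the AI_CANONNAME flag is set.
- ai_next: a pointer to the next element of the linked list of results.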

getaddrinfo() Example

Let’s write a small program that prints the IP addresses for a domain name:

// showip.c -- a simple program that shows a domain name's IP address(es)

#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(int ac, char **av)
{
    struct addrinfo hints;         // Hints or "filters" for getaddrinfo()
    struct addrinfo *res;          // Result of getaddrinfo()
    struct addrinfo *r;            // Pointer to iterate on results
    int status;                    // Return value of getaddrinfo()
    char buffer[INET6_ADDRSTRLEN]; // Buffer to convert IP address

    if (ac != 2)
    {
        fprintf(stderr, "usage: ./a.out hostname\n");
        return (1);
    }

    memset(&hints, 0, sizeof hints); // Initialize the structure
    hints.ai_family = AF_UNSPEC;     // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM; // TCP

    // Get the associated IP address(es)
    status = getaddrinfo(av[1], NULL, &hints, &res);
    if (status != 0)
    { // error!
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        return (2);
    }

    printf("IP addresses for %s:\n", av[1]);

    r = res;
    while (r != NULL)
    {
        if (r->ai_family == AF_INET)
        { // IPv4
            // we need to cast the address as a sockaddr_in structure to
            // get the IP address, since ai_addr might be either
            // sockaddr_in (IPv4) or sockaddr_in6 (IPv6)
            struct sockaddr_in *ipv4 = (struct sockaddr_in *)r->ai_addr;
            // Convert the integer into a legible IP address string
            inet_ntop(r->ai_family, &(ipv4->sin_addr), buffer, sizeof buffer);
            printf("IPv4: %s\n", buffer);
        }
        else
        { // IPv6
            struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)r->ai_addr;
            inet_ntop(r->ai_family, &(ipv6->sin6_addr), buffer, sizeof buffer);
            printf("IPv6: %s\n", buffer);
        }
        r = r->ai_next; // Next address in getaddrinfo()'s results
    }
    freeaddrinfo(res); // Free memory
    return (0);
}

When we run this program with a domain name as an argument, we receive the associated IP address(es) as a result:

[Figure: Result of a program that prints the IP addresses associated with a domain name thanks to getaddrinfo().]

Now that we know how to get the IP address and store it in the appropriate structure, we can turn our attention to preparing our socket in order to actually establish our connection.

Preparing Sockets

At last, we can create the file descriptor for our socket. With it, we will be able to read and write in order to respectively receive and send data. The system call from the <sys/socket.h> library, simply named socket(), is what we need! This is its prototype:

int socket(int domain, int type, int protocol); 

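The parameters it requires are as follows:

- domain: the protocol family of the socket, typically AF_INET for IPv4 or AF_INET6 for IPv6.
- type: the type of socket, SOCK_STREAM for TCP or SOCK_DGRAM for UDP.
- protocol: the protocol to use. Passing 0 lets the system pick the protocol that matches the socket type.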

The socket() function returns the file descriptor of the new socket. In case of failure, it returns -1 and indicates the error it encountered in errno .

In practice, we probably won’t be filling out the socket() function’s parameters manually. Not when we can simply indicate the values returned by getaddrinfo() , a little like this:

int status;
int socket_fd;
struct addrinfo hints;
struct addrinfo *res;

// fill out hints to prepare getaddrinfo() call

status = getaddrinfo("www.example.com", "http", &hints, &res);
// check if getaddrinfo() failed

socket_fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
// check if socket() failed

But this socket file descriptor is not yet connected to anything at all. Naturally, we will want to associate it to a socket address (meaning an IP address and port combination). For this, we have two choices:

- connect the socket to a remote address with connect(), which is what a client does;
- bind the socket to a local address with bind(), which is what a server does before listening for and accepting connections.

[Figure: Diagram of network programming with sockets for a server and a client. The server uses the functions socket, bind, listen, accept, recv, send and close, while the client uses socket, connect, send, recv and close.]

Let’s first explore the client side, then we can take a look at the server side.

Client Side: Connecting to a Server via a Socket

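In order to develop a client, we only need one socket connected to a remote server. All we have to do is use the connect() system call from the <sys/socket.h> library:

int connect(int sockfd, const struct sockaddr *serv_addr,
            socklen_t addrlen);

Its parameters are quite intuitive:

- sockfd: the file descriptor of the socket, as returned by socket().
- serv_addr: a pointer to the structure containing the address and port of the remote server, either a sockaddr_in or a sockaddr_in6 cast to a sockaddr.
- addrlen: the size in bytes of the structure pointed to by serv_addr.

The function predictably returns 0 for success and -1 for failure, with errno set to indicate the error.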

Once more, all of the data needed for the connection can be found in the structure returned by getaddrinfo() :

int status;
int socket_fd;
struct addrinfo hints;
struct addrinfo *res;

// fill out hints to prepare getaddrinfo() call

status = getaddrinfo("www.example.com", "http", &hints, &res);
// check if getaddrinfo() failed

socket_fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
// check if socket() failed

connect(socket_fd, res->ai_addr, res->ai_addrlen);

There we go! Our socket is now ready to send and receive data. But before we get to how to do that, let’s first take a look at establishing a connection from the server side.
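Since getaddrinfo() can return several addresses, a more complete client would typically try each of them in turn until one connection succeeds. The sketch below shows one possible way to do this; the function name connect_to() is hypothetical, and the error handling is deliberately minimal.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: resolves host/service and returns a connected TCP
// socket, or -1 if no address could be reached.
int connect_to(const char *host, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *res;
    struct addrinfo *r;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM; // TCP
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return (-1);
    for (r = res; r != NULL; r = r->ai_next)
    {
        fd = socket(r->ai_family, r->ai_socktype, r->ai_protocol);
        if (fd == -1)
            continue; // this address family may be unavailable, try the next
        if (connect(fd, r->ai_addr, r->ai_addrlen) == 0)
            break;    // connected
        close(fd);    // connection failed, try the next address
        fd = -1;
    }
    freeaddrinfo(res);
    return (fd);
}
```

A call such as connect_to("www.example.com", "http") would then return a file descriptor ready for sending and receiving, or -1 if every address failed.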

Server Side: Accepting Client Connections via a Socket

If we want to develop a server, the connection will need to be done in three steps. First, we need to bind our socket to a local address and port. Then, we’ll have to listen to the port to detect incoming connection requests. And finally, we need to accept those client connection requests.

Binding the Socket to the Local Address

The bind() function of <sys/socket.h> allows us to link our socket to a local address and port. Its prototype is practically identical to that of its twin function connect(), which we examined for the client side:

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen); 

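Just like connect(), the parameters of bind() are:

- sockfd: the file descriptor of the socket, as returned by socket().
- addr: a pointer to the structure containing the local address and port to bind to.
- addrlen: the size in bytes of the structure pointed to by addr.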

As expected, the function returns 0 for success and -1 to indicate an error, with the error code in errno .

Listening via a Socket to Detect Connection Requests

Next, we need to mark the socket as “passive”, meaning it will be used to accept incoming connection requests on the address and port it is bound to. For this, we will use the listen() function, which is also in <sys/socket.h>.

int listen(int sockfd, int backlog); 

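The listen() function takes two parameters:

- sockfd: the file descriptor of the bound socket.
- backlog: the maximum number of connection requests allowed to wait in the pending queue.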

If the call to listen() succeeds, the function returns 0. If it fails, it returns -1 and sets errno accordingly.

Accepting a Client Connection

Finally, we must accept the connection requests from a remote client. When a client connect() s on the port of our machine that our socket is listen() ing to, its request is put in the pending queue. When we accept() the request, that function will return a new file descriptor bound to the client’s address, through which we will be able to communicate with that client. So we will end up with two file descriptors: our initial socket that will continue listening to our port, and a new file descriptor for the client, which we can use to send and receive data.

The prototype of the accept() function of <sys/socket.h> is as follows:

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen); 

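Let’s take a closer look at its parameters:

- sockfd: the file descriptor of the listening socket.
- addr: a pointer to a sockaddr structure in which accept() stores the address of the client. It can be NULL if we don’t need this information.
- addrlen: a pointer to the size of the structure pointed to by addr. On return, accept() updates it with the actual size of the address it stored.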

The accept() function returns the file descriptor of the new socket, or -1 in case it encounters an error, which it indicates in errno .

Server Socket Example

Let’s create a micro-server which can accept a connection request with calls to the bind() , listen() , and accept() functions:

// server.c - a micro-server that accepts a connection before quitting

#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PORT "4242" // our server's port
#define BACKLOG 10  // max number of connection requests in queue

int main(void)
{
    struct addrinfo hints;
    struct addrinfo *res;
    int socket_fd;
    int client_fd;
    int status;
    // sockaddr_storage is a structure that is not associated to
    // a particular family. This allows us to receive either
    // an IPv4 or an IPv6 address
    struct sockaddr_storage client_addr;
    socklen_t addr_size;

    // Prepare the address and port for the server socket
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM; // TCP
    hints.ai_flags = AI_PASSIVE;     // Automatically fills IP address

    status = getaddrinfo(NULL, PORT, &hints, &res);
    if (status != 0)
    {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        return (1);
    }

    // create socket, bind it and listen with it
    socket_fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    status = bind(socket_fd, res->ai_addr, res->ai_addrlen);
    if (status != 0)
    {
        fprintf(stderr, "bind: %s\n", strerror(errno));
        return (2);
    }
    freeaddrinfo(res); // the address list is no longer needed
    listen(socket_fd, BACKLOG);

    // Accept incoming connection
    addr_size = sizeof client_addr;
    client_fd = accept(socket_fd, (struct sockaddr *)&client_addr, &addr_size);
    if (client_fd == -1)
    {
        fprintf(stderr, "accept: %s\n", strerror(errno));
        return (3);
    }
    printf("New connection! Socket fd: %d, client fd: %d\n", socket_fd, client_fd);

    // We are ready to communicate with the client via the client_fd!

    close(client_fd);
    close(socket_fd);
    return (0);
}
The same server can also be written without getaddrinfo(), by filling in a sockaddr_in structure by hand:

// server.c - a micro-server that accepts a connection before quitting

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PORT 4242  // our server's port (an integer here, since htons() expects one)
#define BACKLOG 10 // max number of connection requests in queue

int main(void)
{
    struct sockaddr_in sa;
    int socket_fd;
    int client_fd;
    int status;
    // sockaddr_storage is a structure that is not associated to
    // a particular family. This allows us to receive either
    // an IPv4 or an IPv6 address
    struct sockaddr_storage client_addr;
    socklen_t addr_size;

    // Prepare the address and port for the server socket
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;                     // IPv4 only; use AF_INET6 for IPv6
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // 127.0.0.1, localhost
    sa.sin_port = htons(PORT);

    // create socket, bind it and listen with it
    socket_fd = socket(sa.sin_family, SOCK_STREAM, 0);
    status = bind(socket_fd, (struct sockaddr *)&sa, sizeof sa);
    if (status != 0)
    {
        fprintf(stderr, "bind: %s\n", strerror(errno));
        return (2);
    }
    listen(socket_fd, BACKLOG);

    // Accept incoming connection
    addr_size = sizeof client_addr;
    client_fd = accept(socket_fd, (struct sockaddr *)&client_addr, &addr_size);
    if (client_fd == -1)
    {
        fprintf(stderr, "accept: %s\n", strerror(errno));
        return (3);
    }
    printf("New connection! Socket fd: %d, client fd: %d\n", socket_fd, client_fd);

    // We are ready to communicate with the client via the client_fd!

    close(client_fd);
    close(socket_fd);
    return (0);
}

When we compile and run this program, we’ll notice it looks to be idling. That’s because the accept() function is blocking the execution, waiting for a connection request. This will become an important factor when dealing with multiple client connections, and we will examine ways to handle this at the end of this article.

To simulate a connection request, we can open a new terminal and run the nc command (netcat), specifying the address of our local machine, localhost (or 127.0.0.1 ), and the port on which our little server is running - in this example, it’s 4242 .

[Figure: Output of a server program that detects a new connection with a socket.]

There we go! Our server has detected and accepted the new connection, and displays the file descriptors of both our server’s listening socket and the new client socket.

Sending and Receiving Data Through Sockets

It’s not worth establishing a connection from a client to a server or vice-versa if we don’t know how to send and receive data. Clever readers might notice that since sockets are just file descriptors, we could probably just use the read() and write() system calls. And they’d be totally correct! But other functions exist, ones that give us more control over the way in which our data is sent and received…

Sending Data via a Socket

The send() function from the <sys/socket.h> library allows us to send data through a stream socket, which uses a TCP connection.

ssize_t send(int socket, const void *buf, size_t len, int flags); 

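Its parameters are as follows:

- socket: the file descriptor through which to send the data. For a client, it is the socket we connect()ed; for a server, it is the file descriptor returned by accept().
- buf: a pointer to the buffer containing the message to send.
- len: the size in bytes of the message to send.
- flags: options that modify how the data is sent. Passing 0 selects the default behavior.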

The send() function returns the number of bytes that were successfully sent. Beware, send() might not be able to send the entire message in one go! This means that we’ll have to be careful to compare the value returned here with the length of the message we wish to send, in order to try sending the rest again if need be. As usual, this function can also return -1 if there is an error, and we can check errno for details.
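One common way to handle such partial sends is to call send() in a loop until the whole message has gone out. The sketch below shows the idea; the function name send_all() is hypothetical and not part of any library.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling send() until all `len` bytes are sent.
// Returns 0 on success, -1 on error (errno is set by send()).
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t sent = 0;

    while (sent < len)
    {
        // resume right after the bytes that were already sent
        ssize_t n = send(fd, p + sent, len - sent, 0);

        if (n == -1)
            return (-1);
        sent += (size_t)n;
    }
    return (0);
}
```

Each iteration resumes from the first unsent byte, so the loop ends only once the total matches the length of the message or an error occurs.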

For datagram sockets, the ones that use the connection-less protocol UDP, there is a similar function: sendto() . In addition to the above parameters, it also takes the destination address in the form of a sockaddr type structure.

Receiving Data via a Socket

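Just like the opposite function send(), recv() can be found in <sys/socket.h>. This function allows us to receive data through a socket. Its prototype is as follows:

ssize_t recv(int socket, void *buf, size_t len, int flags);

The parameters of recv() are:

- socket: the file descriptor through which to receive the data.
- buf: a pointer to the buffer in which to store the received data.
- len: the size in bytes of the buffer, which is the maximum amount of data to receive in one call.
- flags: options that modify how the data is received. Passing 0 selects the default behavior.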

Just like send() , recv() returns the number of bytes it managed to store in the buffer. However, if recv() returns 0, it can only mean one thing: the remote computer has closed the connection. Naturally, the recv() function can also return -1 if it encounters an error, in which case it sets the error code in errno .
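These three outcomes (a positive byte count, 0, or -1) can be handled in a simple receive loop. The sketch below keeps reading until the peer closes the connection or the buffer is full; the function name recv_until_closed() is invented for this illustration.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: reads until the peer closes the connection or `cap`
// bytes have been stored. Returns the total byte count, or -1 on error.
ssize_t recv_until_closed(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap)
    {
        ssize_t n = recv(fd, buf + total, cap - total, 0);

        if (n == -1)
            return (-1); // error, details in errno
        if (n == 0)
            break;       // the remote side closed the connection
        total += (size_t)n;
    }
    return ((ssize_t)total);
}
```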

There is also another similar function, recvfrom(), for datagram type sockets which use the connection-less protocol UDP. On top of the parameters we supplied for recv(), this function also takes a pointer to a sockaddr structure, which it fills in with the source address of the message it received, along with a pointer to the size of that structure.
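Although this article focuses on stream sockets, a short sketch may help to see sendto() and recvfrom() side by side. It sends one datagram between two UDP sockets on the local machine. The function name udp_loopback_roundtrip() is hypothetical, and the sketch relies on getsockname(), a function not covered in this article, to find out which port the system assigned.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: sends `msg` from one UDP socket to another over the
// loopback interface. Returns the number of bytes received, or -1 on failure.
ssize_t udp_loopback_roundtrip(const char *msg, char *out, size_t cap)
{
    struct sockaddr_in rx_addr;
    struct sockaddr_storage from;
    socklen_t rx_len = sizeof rx_addr;
    socklen_t from_len = sizeof from;
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0); // receiving socket
    int tx = socket(AF_INET, SOCK_DGRAM, 0); // sending socket

    memset(&rx_addr, 0, sizeof rx_addr);
    rx_addr.sin_family = AF_INET;
    rx_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    rx_addr.sin_port = htons(0); // port 0: let the system pick a free port
    if (rx != -1 && tx != -1
        && bind(rx, (struct sockaddr *)&rx_addr, sizeof rx_addr) == 0
        && getsockname(rx, (struct sockaddr *)&rx_addr, &rx_len) == 0
        && sendto(tx, msg, strlen(msg), 0,
                  (struct sockaddr *)&rx_addr, sizeof rx_addr) != -1)
    {
        // recvfrom() fills `from` with the address the datagram came from
        n = recvfrom(rx, out, cap, 0, (struct sockaddr *)&from, &from_len);
    }
    if (rx != -1)
        close(rx);
    if (tx != -1)
        close(tx);
    return (n);
}
```

Notice that no connect(), listen() or accept() call is involved: the destination address is simply given with each sendto() call.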

Closing a Socket Connection

Once we are done sending and receiving data, we can close our socket. Just like any other file descriptor, a socket can be closed with a simple call to close() from <unistd.h>. This destroys the file descriptor and prevents any further communication through the socket: on the remote side, recv() will return 0, and attempts to send() more data will eventually fail with an error such as EPIPE or ECONNRESET.

But there is another function that is worth mentioning: shutdown() from <sys/socket.h>. This function gives us more control over how we close our socket. Its prototype is:

int shutdown(int sockfd, int how); 

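Its parameters are simple enough:

- sockfd: the file descriptor of the socket to shut down.
- how: which direction of the communication to close: SHUT_RD to prevent further receiving, SHUT_WR to prevent further sending, or SHUT_RDWR to prevent both.

Note that shutdown() does not destroy the file descriptor itself: we still need to call close() to release it.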
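A typical use of shutdown() is the "half-close": after sending a final message, we pass SHUT_WR as the how argument to signal that nothing more will be sent, while still being able to receive the reply. This is only a sketch, and the function name send_and_finish() is made up for this example.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: sends a final message, then half-closes the socket so
// the peer sees the end of the stream while we can still receive its reply.
// Returns 0 on success, -1 on error.
int send_and_finish(int fd, const void *msg, size_t len)
{
    if (send(fd, msg, len, 0) == -1)
        return (-1);
    return (shutdown(fd, SHUT_WR)); // no more sending; receiving still allowed
}
```

On the other side, recv() returns 0 once all the pending data has been read, exactly as if the connection had been closed, yet that side can still send data back.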