FUN WITH LINUX

How to program a very simple webserver in C

5 February 2015

Around 10 years ago I played with network-sockets in C. Now I just want to refresh my knowledge a bit and write a very simple webserver. This webserver just displays a simple site. The site is hardcoded, but that is close enough to see how it works…

Download the Source

git clone https://github.com/whotwagner/SimpleWebserv.git

HTTP-Basics:

Every webserver speaks the HTTP-protocol, which is a pretty simple request/response-protocol. You can try it yourself: connect via telnet to any webserver and send a valid http-request. Let's connect to Google and ask for the file “robots.txt”:

telnet www.google.com 80
Trying 173.194.113.82...
Connected to www.google.com.
Escape character is '^]'.
GET /robots.txt
HTTP/1.0 200 OK
Vary: Accept-Encoding
Content-Type: text/plain
Last-Modified: Thu, 27 Nov 2014 07:03:43 GMT
Date: Wed, 03 Dec 2014 15:17:59 GMT
Expires: Wed, 03 Dec 2014 15:17:59 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
Server: sffe
X-XSS-Protection: 1; mode=block
Alternate-Protocol: 80:quic,p=0.02

User-agent: *
Disallow: /search
Disallow: /sdch
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Allow: /catalogs/about
Allow: /catalogs/p?
Disallow: /catalogues
Allow: /newsalerts
Disallow: /news
Allow: /news/directory
Disallow: /nwshp
Disallow: /setnewsprefs?
Disallow: /index.html?
Sitemap: http://www.gstatic.com/culturalinstitute/sitemaps/www_google_com_culturalinstitute/sitemap-index.xml
Sitemap: https://www.google.com/work/sitemap.xml
Sitemap: https://www.google.com/intx/sitemap.xml
Sitemap: http://www.google.com/hostednews/sitemap_index.xml
Sitemap: http://www.google.com/maps/views/sitemap.xml
Sitemap: http://www.google.com/sitemaps_webmasters.xml
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
Sitemap: http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml
Sitemap: http://www.gstatic.com/earth/gallery/sitemaps/sitemap.xml
Sitemap: http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
Sitemap: http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml
Sitemap: http://www.google.com/adwords/sitemap.xml
Sitemap: http://www.google.com/drive/sitemap.xml
Connection closed by foreign host.

First we sent “GET /robots.txt” to Google and then received the HTTP-response. This response starts with the status line “HTTP/1.0 200 OK”, followed by the HTTP-headers:

HTTP/1.0 200 OK
Vary: Accept-Encoding
Content-Type: text/plain
Last-Modified: Thu, 27 Nov 2014 07:03:43 GMT
Date: Wed, 03 Dec 2014 15:17:59 GMT
Expires: Wed, 03 Dec 2014 15:17:59 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
Server: sffe
X-XSS-Protection: 1; mode=block
Alternate-Protocol: 80:quic,p=0.02

The rest of the output is the content of the robots.txt-file. This was a very simple HTTP-request without a protocol version or headers. It is also possible to send a multiline HTTP-request with headers, which has to be terminated by an empty line:

telnet www.google.com 80
Trying 173.194.113.84...
Connected to www.google.com.
Escape character is '^]'.
GET /robots.txt HTTP/1.1
User-Agent: Telnet

HTTP/1.1 200 OK
Vary: Accept-Encoding
Content-Type: text/plain
Last-Modified: Thu, 27 Nov 2014 07:03:43 GMT
Date: Wed, 03 Dec 2014 15:27:29 GMT
Expires: Wed, 03 Dec 2014 15:27:29 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
Server: sffe
X-XSS-Protection: 1; mode=block
Alternate-Protocol: 80:quic,p=0.02
Transfer-Encoding: chunked

1e5f
User-agent: *
Disallow: /search
Disallow: /sdch
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Allow: /catalogs/about
Allow: /catalogs/p?
Disallow: /catalogues
Allow: /newsalerts
Disallow: /news
Allow: /news/directory
Disallow: /nwshp
Disallow: /setnewsprefs?
Disallow: /index.html?
Sitemap: http://www.gstatic.com/culturalinstitute/sitemaps/www_google_com_culturalinstitute/sitemap-index.xml
Sitemap: https://www.google.com/work/sitemap.xml
Sitemap: https://www.google.com/intx/sitemap.xml
Sitemap: http://www.google.com/hostednews/sitemap_index.xml
Sitemap: http://www.google.com/maps/views/sitemap.xml
Sitemap: http://www.google.com/sitemaps_webmasters.xml
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
Sitemap: http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml
Sitemap: http://www.gstatic.com/earth/gallery/sitemaps/sitemap.xml
Sitemap: http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
Sitemap: http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml
Sitemap: http://www.google.com/adwords/sitemap.xml
Sitemap: http://www.google.com/drive/sitemap.xml


Connection closed by foreign host.

In this example we sent a multiline request containing the User-Agent-Header:

GET /robots.txt HTTP/1.1
User-Agent: Telnet

This means that our webserver has to handle simple one-line requests as well as more complex multiline requests.
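To make the difference between the two request forms concrete, here is a minimal sketch of how the first line of a request could be split into its parts. The names parse_request_line and struct request_line are made up for this illustration and are not taken from the SimpleWebserv source; the buffer sizes are arbitrary assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three parts of a request line. */
struct request_line {
    char method[16];
    char path[256];
    char version[16];   /* stays empty if the client sent no version */
};

/*
 * Parse "METHOD PATH [VERSION]" from the first line of a request.
 * Returns 0 on success and -1 if the method or the path is missing.
 */
int parse_request_line(const char *line, struct request_line *out)
{
    int n;

    memset(out, 0, sizeof(*out));
    /* The field widths are one less than the buffer sizes, so there is
       always room for the terminating '\0'. %s stops at whitespace, so a
       trailing "\r\n" never ends up in a field. */
    n = sscanf(line, "%15s %255s %15s", out->method, out->path, out->version);
    return (n >= 2) ? 0 : -1;
}
```

For “GET /robots.txt HTTP/1.1” all three fields are filled. For the short form “GET /robots.txt” the version field stays empty, which a server could use to decide that no headers will follow.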

Sockets

In order to have a functional webserver we have to create a network-socket and make it listen for connections. When a client connects, the server has to accept the connection and start the http-routine. It would be nice if our webserver could handle more than one client at once, so we have to think about this too.

So first let’s create a TCP-Socket:

sockfd = socket(AF_INET,SOCK_STREAM,0);

Next we bind this socket to a specific port:

memset(&servaddr,0,sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl(INADDR_ANY); 
servaddr.sin_port = htons(SERV_PORT); 

/* bind socket to address + port */
if(bind(sockfd, (struct sockaddr*)&servaddr,sizeof(servaddr)) != 0)
{
        perror("bind() failed..");
        close(sockfd);
        exit(EXIT_FAILURE);
}

..and start listening:

listen(sockfd,LISTENQ);
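The three steps above can also be put into one function that checks every return value. This is only a sketch: the name create_listener and its parameters are assumptions for illustration, and SO_REUSEADDR is an addition that is not part of the snippets above. It allows the server to be restarted without waiting for old connections to time out.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP socket, bind it to all interfaces on `port` and start
 * listening. Returns the listening descriptor, or -1 on error.
 * (create_listener is a hypothetical helper, not from the original source.)
 */
int create_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Allow a quick restart while old connections linger in TIME_WAIT. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) != 0) {
        close(fd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Close the socket again if either bind() or listen() fails. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(fd, backlog) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A call like create_listener(SERV_PORT, LISTENQ) would then replace the separate socket(), bind() and listen() calls. Passing port 0 lets the kernel pick a free port, which is handy for quick experiments.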

Now it starts to get a bit tricky. Our server has to accept the connections of the clients, then fork a child-process for each connection, and this child-process does the http-stuff. This does not sound that complicated, but we have to take care of our child-processes too: if the parent never collects their exit status, terminated children stay around as zombie-processes.

Let’s start with the child-handler:

void sigchld_handler(int signo)
{
    int status;
    pid_t pid;

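    /*
       -1 means that we wait for any child-process.
       WNOHANG tells the kernel not to block if there are no terminated
       child-processes. We loop because several children may have
       terminated before the handler runs.
       Note: printf() is not async-signal-safe; it is used here only to
       keep the example short.
     */
    while( (pid = waitpid(-1,&status,WNOHANG)) > 0)
    {
        clientcount--;
        printf("%i exited with %i\n", (int)pid, WEXITSTATUS(status));
    }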

    return;
}

Now we register this child-handler for the SIGCHLD signal, so that it is called whenever a child-process terminates and the parent can collect its exit status:

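memset(&sa, 0, sizeof(sa));
sa.sa_handler = sigchld_handler;
sigemptyset(&sa.sa_mask);
/* restart accept() automatically if it gets interrupted by SIGCHLD */
sa.sa_flags = SA_RESTART;
sigaction(SIGCHLD, &sa, NULL);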

Now let’s complete our server-socket-construct:

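while(1)
{
        /* accept() expects the size of the address buffer in clilen */
        clilen = sizeof(cliaddr);
        connfd = accept(sockfd, (struct sockaddr *)&cliaddr, &clilen);
        if(connfd < 0)
        {
                /* accept() can fail, e.g. when it is interrupted by a
                   signal; report it and try again */
                perror("accept() failed");
                continue;
        }

        childpid = fork();
        if(childpid < 0)
        {
                perror("fork() failed");
                exit(EXIT_FAILURE);
        }

        /* let's start our child-subprocess */
        if( childpid == 0 )
        {
                /* the child does not need the listening socket */
                close(sockfd);
                exit(handle_client(connfd));
        }

        /* continue our server-routine */
        printf("Client has PID %i\n",(int)childpid);

        /* the child owns the connection now, the parent closes its copy */
        close(connfd);
}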

Where to go from here..

In this article I just showed the most important routines needed to implement a simple webserver in C. Of course a real webserver needs much more work, but I think this gives a glimpse of how it might work. The full source of this example (with a lot of comments) is available at the following link: https://github.com/whotwagner/SimpleWebserv

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 3.0 Unported License.

Copyright 2015-present Hoti