| .\" ************************************************************************** | 
 | .\" *                                  _   _ ____  _ | 
 | .\" *  Project                     ___| | | |  _ \| | | 
 | .\" *                             / __| | | | |_) | | | 
 | .\" *                            | (__| |_| |  _ <| |___ | 
 | .\" *                             \___|\___/|_| \_\_____| | 
 | .\" * | 
 | .\" * Copyright (C) 1998 - 2014, Daniel Stenberg, <daniel@haxx.se>, et al. | 
 | .\" * | 
 | .\" * This software is licensed as described in the file COPYING, which | 
 | .\" * you should have received as part of this distribution. The terms | 
 | .\" * are also available at http://curl.haxx.se/docs/copyright.html. | 
 | .\" * | 
 | .\" * You may opt to use, copy, modify, merge, publish, distribute and/or sell | 
 | .\" * copies of the Software, and permit persons to whom the Software is | 
 | .\" * furnished to do so, under the terms of the COPYING file. | 
 | .\" * | 
 | .\" * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY | 
 | .\" * KIND, either express or implied. | 
 | .\" * | 
 | .\" ************************************************************************** | 
 | .\" | 
 | .TH libcurl-tutorial 3 "2 Aug 2014" "libcurl" "libcurl programming" | 
 | .SH NAME | 
 | libcurl-tutorial \- libcurl programming tutorial | 
 | .SH "Objective" | 
 | This document attempts to describe the general principles and some basic | 
 | approaches to consider when programming with libcurl. The text will focus | 
 | mainly on the C interface but might apply fairly well on other interfaces as | 
 | well as they usually follow the C one pretty closely. | 
 |  | 
 | This document will refer to 'the user' as the person writing the source code | 
 | that uses libcurl. That would probably be you or someone in your position. | 
 | What will be generally referred to as 'the program' will be the collected | 
 | source code that you write that is using libcurl for transfers. The program | 
 | is outside libcurl and libcurl is outside of the program. | 
 |  | 
 | To get more details on all options and functions described herein, please | 
 | refer to their respective man pages. | 
 |  | 
 | .SH "Building" | 
 | There are many different ways to build C programs. This chapter will assume a | 
 | UNIX-style build process. If you use a different build system, you can still | 
 | read this to get general information that may apply to your environment as | 
 | well. | 
 | .IP "Compiling the Program" | 
 | Your compiler needs to know where the libcurl headers are located. Therefore | 
 | you must set your compiler's include path to point to the directory where you | 
 | installed them. The 'curl-config'[3] tool can be used to get this information: | 
 |  | 
 | $ curl-config --cflags | 
 |  | 
 | .IP "Linking the Program with libcurl" | 
 | When having compiled the program, you need to link your object files to create | 
 | a single executable. For that to succeed, you need to link with libcurl and | 
 | possibly also with other libraries that libcurl itself depends on. Like the | 
 | OpenSSL libraries, but even some standard OS libraries may be needed on the | 
 | command line. To figure out which flags to use, once again the 'curl-config' | 
 | tool comes to the rescue: | 
 |  | 
 | $ curl-config --libs | 
 |  | 
 | .IP "SSL or Not" | 
 | libcurl can be built and customized in many ways. One of the things that | 
 | varies from different libraries and builds is the support for SSL-based | 
 | transfers, like HTTPS and FTPS. If a supported SSL library was detected | 
 | properly at build-time, libcurl will be built with SSL support. To figure out | 
 | if an installed libcurl has been built with SSL support enabled, use | 
 | \&'curl-config' like this: | 
 |  | 
 | $ curl-config --feature | 
 |  | 
 | And if SSL is supported, the keyword 'SSL' will be written to stdout, | 
 | possibly together with a few other features that could be either on or off on | 
 | for different libcurls. | 
 |  | 
 | See also the "Features libcurl Provides" further down. | 
 | .IP "autoconf macro" | 
 | When you write your configure script to detect libcurl and setup variables | 
 | accordingly, we offer a prewritten macro that probably does everything you | 
 | need in this area. See docs/libcurl/libcurl.m4 file - it includes docs on how | 
 | to use it. | 
 |  | 
 | .SH "Portable Code in a Portable World" | 
 | The people behind libcurl have put a considerable effort to make libcurl work | 
 | on a large amount of different operating systems and environments. | 
 |  | 
 | You program libcurl the same way on all platforms that libcurl runs on. There | 
 | are only very few minor considerations that differ. If you just make sure to | 
 | write your code portable enough, you may very well create yourself a very | 
 | portable program. libcurl shouldn't stop you from that. | 
 |  | 
 | .SH "Global Preparation" | 
 | The program must initialize some of the libcurl functionality globally. That | 
 | means it should be done exactly once, no matter how many times you intend to | 
 | use the library. Once for your program's entire life time. This is done using | 
 |  | 
 |  curl_global_init() | 
 |  | 
 | and it takes one parameter which is a bit pattern that tells libcurl what to | 
 | initialize. Using \fICURL_GLOBAL_ALL\fP will make it initialize all known | 
 | internal sub modules, and might be a good default option. The current two bits | 
 | that are specified are: | 
 | .RS | 
 | .IP "CURL_GLOBAL_WIN32" | 
 | which only does anything on Windows machines. When used on | 
 | a Windows machine, it'll make libcurl initialize the win32 socket | 
 | stuff. Without having that initialized properly, your program cannot use | 
 | sockets properly. You should only do this once for each application, so if | 
 | your program already does this or of another library in use does it, you | 
 | should not tell libcurl to do this as well. | 
 | .IP CURL_GLOBAL_SSL | 
 | which only does anything on libcurls compiled and built SSL-enabled. On these | 
 | systems, this will make libcurl initialize the SSL library properly for this | 
 | application. This only needs to be done once for each application so if your | 
 | program or another library already does this, this bit should not be needed. | 
 | .RE | 
 |  | 
 | libcurl has a default protection mechanism that detects if | 
 | \fIcurl_global_init(3)\fP hasn't been called by the time | 
 | \fIcurl_easy_perform(3)\fP is called and if that is the case, libcurl runs the | 
 | function itself with a guessed bit pattern. Please note that depending solely | 
 | on this is not considered nice nor very good. | 
 |  | 
 | When the program no longer uses libcurl, it should call | 
 | \fIcurl_global_cleanup(3)\fP, which is the opposite of the init call. It will | 
 | then do the reversed operations to cleanup the resources the | 
 | \fIcurl_global_init(3)\fP call initialized. | 
 |  | 
 | Repeated calls to \fIcurl_global_init(3)\fP and \fIcurl_global_cleanup(3)\fP | 
 | should be avoided. They should only be called once each. | 
 |  | 
 | .SH "Features libcurl Provides" | 
 | It is considered best-practice to determine libcurl features at run-time | 
 | rather than at build-time (if possible of course). By calling | 
 | \fIcurl_version_info(3)\fP and checking out the details of the returned | 
 | struct, your program can figure out exactly what the currently running libcurl | 
 | supports. | 
 |  | 
 | .SH "Handle the Easy libcurl" | 
 | libcurl first introduced the so called easy interface. All operations in the | 
 | easy interface are prefixed with 'curl_easy'. | 
 |  | 
 | Recent libcurl versions also offer the multi interface. More about that | 
 | interface, what it is targeted for and how to use it is detailed in a separate | 
 | chapter further down. You still need to understand the easy interface first, | 
 | so please continue reading for better understanding. | 
 |  | 
 | To use the easy interface, you must first create yourself an easy handle. You | 
 | need one handle for each easy session you want to perform. Basically, you | 
 | should use one handle for every thread you plan to use for transferring. You | 
 | must never share the same handle in multiple threads. | 
 |  | 
 | Get an easy handle with | 
 |  | 
 |  easyhandle = curl_easy_init(); | 
 |  | 
 | It returns an easy handle. Using that you proceed to the next step: setting | 
 | up your preferred actions. A handle is just a logic entity for the upcoming | 
 | transfer or series of transfers. | 
 |  | 
 | You set properties and options for this handle using | 
 | \fIcurl_easy_setopt(3)\fP. They control how the subsequent transfer or | 
 | transfers will be made. Options remain set in the handle until set again to | 
 | something different. Alas, multiple requests using the same handle will use | 
 | the same options. | 
 |  | 
 | Many of the options you set in libcurl are "strings", pointers to data | 
 | terminated with a zero byte. When you set strings with | 
 | \fIcurl_easy_setopt(3)\fP, libcurl makes its own copy so that they don't | 
 | need to be kept around in your application after being set[4]. | 
 |  | 
 | One of the most basic properties to set in the handle is the URL. You set your | 
 | preferred URL to transfer with \fICURLOPT_URL(3)\fP in a manner similar to: | 
 |  | 
 | .nf | 
 |  curl_easy_setopt(handle, CURLOPT_URL, "http://domain.com/"); | 
 | .fi | 
 |  | 
 | Let's assume for a while that you want to receive data as the URL identifies a | 
 | remote resource you want to get here. Since you write a sort of application | 
 | that needs this transfer, I assume that you would like to get the data passed | 
 | to you directly instead of simply getting it passed to stdout. So, you write | 
 | your own function that matches this prototype: | 
 |  | 
 |  size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp); | 
 |  | 
 | You tell libcurl to pass all data to this function by issuing a function | 
 | similar to this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, write_data); | 
 |  | 
 | You can control what data your callback function gets in the fourth argument | 
 | by setting another property: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &internal_struct); | 
 |  | 
 | Using that property, you can easily pass local data between your application | 
 | and the function that gets invoked by libcurl. libcurl itself won't touch the | 
 | data you pass with \fICURLOPT_WRITEDATA(3)\fP. | 
 |  | 
 | libcurl offers its own default internal callback that will take care of the | 
 | data if you don't set the callback with \fICURLOPT_WRITEFUNCTION(3)\fP. It | 
 | will then simply output the received data to stdout. You can have the default | 
 | callback write the data to a different file handle by passing a 'FILE *' to a | 
 | file opened for writing with the \fICURLOPT_WRITEDATA(3)\fP option. | 
 |  | 
 | Now, we need to take a step back and have a deep breath. Here's one of those | 
 | rare platform-dependent nitpicks. Did you spot it? On some platforms[2], | 
 | libcurl won't be able to operate on files opened by the program. Thus, if you | 
 | use the default callback and pass in an open file with | 
 | \fICURLOPT_WRITEDATA(3)\fP, it will crash. You should therefore avoid this to | 
 | make your program run fine virtually everywhere. | 
 |  | 
 | (\fICURLOPT_WRITEDATA(3)\fP was formerly known as \fICURLOPT_FILE\fP. Both | 
 | names still work and do the same thing). | 
 |  | 
 | If you're using libcurl as a win32 DLL, you MUST use the | 
 | \fICURLOPT_WRITEFUNCTION(3)\fP if you set \fICURLOPT_WRITEDATA(3)\fP - or you | 
 | will experience crashes. | 
 |  | 
 | There are of course many more options you can set, and we'll get back to a few | 
 | of them later. Let's instead continue to the actual transfer: | 
 |  | 
 |  success = curl_easy_perform(easyhandle); | 
 |  | 
 | \fIcurl_easy_perform(3)\fP will connect to the remote site, do the necessary | 
 | commands and receive the transfer. Whenever it receives data, it calls the | 
 | callback function we previously set. The function may get one byte at a time, | 
 | or it may get many kilobytes at once. libcurl delivers as much as possible as | 
 | often as possible. Your callback function should return the number of bytes it | 
 | \&"took care of". If that is not the exact same amount of bytes that was | 
 | passed to it, libcurl will abort the operation and return with an error code. | 
 |  | 
 | When the transfer is complete, the function returns a return code that informs | 
 | you if it succeeded in its mission or not. If a return code isn't enough for | 
 | you, you can use the \fICURLOPT_ERRORBUFFER(3)\fP to point libcurl to a buffer | 
 | of yours where it'll store a human readable error message as well. | 
 |  | 
 | If you then want to transfer another file, the handle is ready to be used | 
 | again. Mind you, it is even preferred that you re-use an existing handle if | 
 | you intend to make another transfer. libcurl will then attempt to re-use the | 
 | previous connection. | 
 |  | 
 | For some protocols, downloading a file can involve a complicated process of | 
 | logging in, setting the transfer mode, changing the current directory and | 
 | finally transferring the file data. libcurl takes care of all that | 
 | complication for you. Given simply the URL to a file, libcurl will take care | 
 | of all the details needed to get the file moved from one machine to another. | 
 |  | 
 | .SH "Multi-threading Issues" | 
 | The first basic rule is that you must \fBnever\fP simultaneously share a | 
 | libcurl handle (be it easy or multi or whatever) between multiple | 
 | threads. Only use one handle in one thread at any time. You can pass the | 
 | handles around among threads, but you must never use a single handle from more | 
 | than one thread at any given time. | 
 |  | 
 | libcurl is completely thread safe, except for two issues: signals and SSL/TLS | 
 | handlers. Signals are used for timing out name resolves (during DNS lookup) - | 
 | when built without c-ares support and not on Windows. | 
 |  | 
 | If you are accessing HTTPS or FTPS URLs in a multi-threaded manner, you are | 
 | then of course using the underlying SSL library multi-threaded and those libs | 
 | might have their own requirements on this issue. Basically, you need to | 
 | provide one or two functions to allow it to function properly. For all | 
 | details, see this: | 
 |  | 
 | OpenSSL | 
 |  | 
 |  http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION | 
 |  | 
 | GnuTLS | 
 |  | 
 |  http://gnutls.org/manual/html_node/Thread-safety.html | 
 |  | 
 | NSS | 
 |  | 
 |  is claimed to be thread-safe already without anything required. | 
 |  | 
 | PolarSSL | 
 |  | 
 |  Required actions unknown. | 
 |  | 
 | yassl | 
 |  | 
 |  Required actions unknown. | 
 |  | 
 | axTLS | 
 |  | 
 |  Required actions unknown. | 
 |  | 
 | Secure Transport | 
 |  | 
 |  The engine is fully thread-safe, and no additional steps are required. | 
 |  | 
 | When using multiple threads you should set the \fICURLOPT_NOSIGNAL(3)\fP | 
 | option to 1 for all handles. Everything will or might work fine except that | 
 | timeouts are not honored during the DNS lookup - which you can work around by | 
 | building libcurl with c-ares support. c-ares is a library that provides | 
 | asynchronous name resolves. On some platforms, libcurl simply will not | 
 | function properly multi-threaded unless this option is set. | 
 |  | 
 | Also, note that \fICURLOPT_DNS_USE_GLOBAL_CACHE(3)\fP is not thread-safe. | 
 |  | 
 | .SH "When It Doesn't Work" | 
 | There will always be times when the transfer fails for some reason. You might | 
 | have set the wrong libcurl option or misunderstood what the libcurl option | 
 | actually does, or the remote server might return non-standard replies that | 
 | confuse the library which then confuses your program. | 
 |  | 
 | There's one golden rule when these things occur: set the | 
 | \fICURLOPT_VERBOSE(3)\fP option to 1. It'll cause the library to spew out the | 
 | entire protocol details it sends, some internal info and some received | 
 | protocol data as well (especially when using FTP). If you're using HTTP, | 
 | adding the headers in the received output to study is also a clever way to get | 
 | a better understanding why the server behaves the way it does. Include headers | 
 | in the normal body output with \fICURLOPT_HEADER(3)\fP set 1. | 
 |  | 
 | Of course, there are bugs left. We need to know about them to be able to fix | 
 | them, so we're quite dependent on your bug reports! When you do report | 
 | suspected bugs in libcurl, please include as many details as you possibly can: | 
 | a protocol dump that \fICURLOPT_VERBOSE(3)\fP produces, library version, as | 
 | much as possible of your code that uses libcurl, operating system name and | 
 | version, compiler name and version etc. | 
 |  | 
 | If \fICURLOPT_VERBOSE(3)\fP is not enough, you increase the level of debug | 
 | data your application receive by using the \fICURLOPT_DEBUGFUNCTION(3)\fP. | 
 |  | 
 | Getting some in-depth knowledge about the protocols involved is never wrong, | 
 | and if you're trying to do funny things, you might very well understand | 
 | libcurl and how to use it better if you study the appropriate RFC documents | 
 | at least briefly. | 
 |  | 
 | .SH "Upload Data to a Remote Site" | 
 | libcurl tries to keep a protocol independent approach to most transfers, thus | 
 | uploading to a remote FTP site is very similar to uploading data to a HTTP | 
 | server with a PUT request. | 
 |  | 
 | Of course, first you either create an easy handle or you re-use one existing | 
 | one. Then you set the URL to operate on just like before. This is the remote | 
 | URL, that we now will upload. | 
 |  | 
 | Since we write an application, we most likely want libcurl to get the upload | 
 | data by asking us for it. To make it do that, we set the read callback and | 
 | the custom pointer libcurl will pass to our read callback. The read callback | 
 | should have a prototype similar to: | 
 |  | 
 |  size_t function(char *bufptr, size_t size, size_t nitems, void *userp); | 
 |  | 
 | Where bufptr is the pointer to a buffer we fill in with data to upload and | 
 | size*nitems is the size of the buffer and therefore also the maximum amount | 
 | of data we can return to libcurl in this call. The 'userp' pointer is the | 
 | custom pointer we set to point to a struct of ours to pass private data | 
 | between the application and the callback. | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function); | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata); | 
 |  | 
 | Tell libcurl that we want to upload: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L); | 
 |  | 
 | A few protocols won't behave properly when uploads are done without any prior | 
 | knowledge of the expected file size. So, set the upload file size using the | 
 | \fICURLOPT_INFILESIZE_LARGE(3)\fP for all known file sizes like this[1]: | 
 |  | 
 | .nf | 
 |  /* in this example, file_size must be an curl_off_t variable */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size); | 
 | .fi | 
 |  | 
 | When you call \fIcurl_easy_perform(3)\fP this time, it'll perform all the | 
 | necessary operations and when it has invoked the upload it'll call your | 
 | supplied callback to get the data to upload. The program should return as much | 
 | data as possible in every invoke, as that is likely to make the upload perform | 
 | as fast as possible. The callback should return the number of bytes it wrote | 
 | in the buffer. Returning 0 will signal the end of the upload. | 
 |  | 
 | .SH "Passwords" | 
 | Many protocols use or even require that user name and password are provided | 
 | to be able to download or upload the data of your choice. libcurl offers | 
 | several ways to specify them. | 
 |  | 
 | Most protocols support that you specify the name and password in the URL | 
 | itself. libcurl will detect this and use them accordingly. This is written | 
 | like this: | 
 |  | 
 |  protocol://user:password@example.com/path/ | 
 |  | 
 | If you need any odd letters in your user name or password, you should enter | 
 | them URL encoded, as %XX where XX is a two-digit hexadecimal number. | 
 |  | 
 | libcurl also provides options to set various passwords. The user name and | 
 | password as shown embedded in the URL can instead get set with the | 
 | \fICURLOPT_USERPWD(3)\fP option. The argument passed to libcurl should be a | 
 | char * to a string in the format "user:password". In a manner like this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret"); | 
 |  | 
 | Another case where name and password might be needed at times, is for those | 
 | users who need to authenticate themselves to a proxy they use. libcurl offers | 
 | another option for this, the \fICURLOPT_PROXYUSERPWD(3)\fP. It is used quite | 
 | similar to the \fICURLOPT_USERPWD(3)\fP option like this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "myname:thesecret"); | 
 |  | 
 | There's a long time UNIX "standard" way of storing ftp user names and | 
 | passwords, namely in the $HOME/.netrc file. The file should be made private | 
 | so that only the user may read it (see also the "Security Considerations" | 
 | chapter), as it might contain the password in plain text. libcurl has the | 
 | ability to use this file to figure out what set of user name and password to | 
 | use for a particular host. As an extension to the normal functionality, | 
 | libcurl also supports this file for non-FTP protocols such as HTTP. To make | 
 | curl use this file, use the \fICURLOPT_NETRC(3)\fP option: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L); | 
 |  | 
 | And a very basic example of how such a .netrc file may look like: | 
 |  | 
 | .nf | 
 |  machine myhost.mydomain.com | 
 |  login userlogin | 
 |  password secretword | 
 | .fi | 
 |  | 
 | All these examples have been cases where the password has been optional, or | 
 | at least you could leave it out and have libcurl attempt to do its job | 
 | without it. There are times when the password isn't optional, like when | 
 | you're using an SSL private key for secure transfers. | 
 |  | 
 | To pass the known private key password to libcurl: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_KEYPASSWD, "keypassword"); | 
 |  | 
 | .SH "HTTP Authentication" | 
 | The previous chapter showed how to set user name and password for getting | 
 | URLs that require authentication. When using the HTTP protocol, there are | 
 | many different ways a client can provide those credentials to the server and | 
 | you can control which way libcurl will (attempt to) use them. The default HTTP | 
 | authentication method is called 'Basic', which is sending the name and | 
 | password in clear-text in the HTTP request, base64-encoded. This is insecure. | 
 |  | 
 | At the time of this writing, libcurl can be built to use: Basic, Digest, NTLM, | 
 | Negotiate (SPNEGO). You can tell libcurl which one to use | 
 | with \fICURLOPT_HTTPAUTH(3)\fP as in: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST); | 
 |  | 
 | And when you send authentication to a proxy, you can also set authentication | 
 | type the same way but instead with \fICURLOPT_PROXYAUTH(3)\fP: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_PROXYAUTH, CURLAUTH_NTLM); | 
 |  | 
 | Both these options allow you to set multiple types (by ORing them together), | 
 | to make libcurl pick the most secure one out of the types the server/proxy | 
 | claims to support. This method does however add a round-trip since libcurl | 
 | must first ask the server what it supports: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, | 
 |  CURLAUTH_DIGEST|CURLAUTH_BASIC); | 
 |  | 
 | For convenience, you can use the 'CURLAUTH_ANY' define (instead of a list | 
 | with specific types) which allows libcurl to use whatever method it wants. | 
 |  | 
 | When asking for multiple types, libcurl will pick the available one it | 
 | considers "best" in its own internal order of preference. | 
 |  | 
 | .SH "HTTP POSTing" | 
 | We get many questions regarding how to issue HTTP POSTs with libcurl the | 
 | proper way. This chapter will thus include examples using both different | 
 | versions of HTTP POST that libcurl supports. | 
 |  | 
 | The first version is the simple POST, the most common version, that most HTML | 
 | pages using the <form> tag uses. We provide a pointer to the data and tell | 
 | libcurl to post it all to the remote site: | 
 |  | 
 | .nf | 
 |     char *data="name=daniel&project=curl"; | 
 |     curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, data); | 
 |     curl_easy_setopt(easyhandle, CURLOPT_URL, "http://posthere.com/"); | 
 |  | 
 |     curl_easy_perform(easyhandle); /* post away! */ | 
 | .fi | 
 |  | 
 | Simple enough, huh? Since you set the POST options with the | 
 | \fICURLOPT_POSTFIELDS(3)\fP, this automatically switches the handle to use | 
 | POST in the upcoming request. | 
 |  | 
 | Ok, so what if you want to post binary data that also requires you to set the | 
 | Content-Type: header of the post? Well, binary posts prevent libcurl from | 
 | being able to do strlen() on the data to figure out the size, so therefore we | 
 | must tell libcurl the size of the post data. Setting headers in libcurl | 
 | requests are done in a generic way, by building a list of our own headers and | 
 | then passing that list to libcurl. | 
 |  | 
 | .nf | 
 |  struct curl_slist *headers=NULL; | 
 |  headers = curl_slist_append(headers, "Content-Type: text/xml"); | 
 |  | 
 |  /* post binary data */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, binaryptr); | 
 |  | 
 |  /* set the size of the postfields data */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDSIZE, 23L); | 
 |  | 
 |  /* pass our list of custom made headers */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers); | 
 |  | 
 |  curl_easy_perform(easyhandle); /* post away! */ | 
 |  | 
 |  curl_slist_free_all(headers); /* free the header list */ | 
 | .fi | 
 |  | 
 | While the simple examples above cover the majority of all cases where HTTP | 
 | POST operations are required, they don't do multi-part formposts. Multi-part | 
 | formposts were introduced as a better way to post (possibly large) binary data | 
 | and were first documented in the RFC1867 (updated in RFC2388). They're called | 
 | multi-part because they're built by a chain of parts, each part being a single | 
 | unit of data. Each part has its own name and contents. You can in fact create | 
 | and post a multi-part formpost with the regular libcurl POST support described | 
 | above, but that would require that you build a formpost yourself and provide | 
 | to libcurl. To make that easier, libcurl provides \fIcurl_formadd(3)\fP. Using | 
 | this function, you add parts to the form. When you're done adding parts, you | 
 | post the whole form. | 
 |  | 
 | The following example sets two simple text parts with plain textual contents, | 
 | and then a file with binary contents and uploads the whole thing. | 
 |  | 
 | .nf | 
 |  struct curl_httppost *post=NULL; | 
 |  struct curl_httppost *last=NULL; | 
 |  curl_formadd(&post, &last, | 
 |               CURLFORM_COPYNAME, "name", | 
 |               CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END); | 
 |  curl_formadd(&post, &last, | 
 |               CURLFORM_COPYNAME, "project", | 
 |               CURLFORM_COPYCONTENTS, "curl", CURLFORM_END); | 
 |  curl_formadd(&post, &last, | 
 |               CURLFORM_COPYNAME, "logotype-image", | 
 |               CURLFORM_FILECONTENT, "curl.png", CURLFORM_END); | 
 |  | 
 |  /* Set the form info */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPPOST, post); | 
 |  | 
 |  curl_easy_perform(easyhandle); /* post away! */ | 
 |  | 
 |  /* free the post data again */ | 
 |  curl_formfree(post); | 
 | .fi | 
 |  | 
 | Multipart formposts are chains of parts using MIME-style separators and | 
 | headers. It means that each one of these separate parts get a few headers set | 
 | that describe the individual content-type, size etc. To enable your | 
 | application to handicraft this formpost even more, libcurl allows you to | 
 | supply your own set of custom headers to such an individual form part. You can | 
 | of course supply headers to as many parts as you like, but this little example | 
 | will show how you set headers to one specific part when you add that to the | 
 | post handle: | 
 |  | 
 | .nf | 
 |  struct curl_slist *headers=NULL; | 
 |  headers = curl_slist_append(headers, "Content-Type: text/xml"); | 
 |  | 
 |  curl_formadd(&post, &last, | 
 |               CURLFORM_COPYNAME, "logotype-image", | 
 |               CURLFORM_FILECONTENT, "curl.xml", | 
 |               CURLFORM_CONTENTHEADER, headers, | 
 |               CURLFORM_END); | 
 |  | 
 |  curl_easy_perform(easyhandle); /* post away! */ | 
 |  | 
 |  curl_formfree(post); /* free post */ | 
 |  curl_slist_free_all(headers); /* free custom header list */ | 
 | .fi | 
 |  | 
 | Since all options on an easyhandle are "sticky", they remain the same until | 
 | changed even if you do call \fIcurl_easy_perform(3)\fP, you may need to tell | 
 | curl to go back to a plain GET request if you intend to do one as your next | 
 | request. You force an easyhandle to go back to GET by using the | 
 | \fICURLOPT_HTTPGET(3)\fP option: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, 1L); | 
 |  | 
 | Just setting \fICURLOPT_POSTFIELDS(3)\fP to "" or NULL will *not* stop libcurl | 
 | from doing a POST. It will just make it POST without any data to send! | 
 |  | 
 | .SH "Showing Progress" | 
 |  | 
 | For historical and traditional reasons, libcurl has a built-in progress meter | 
 | that can be switched on and then makes it present a progress meter in your | 
 | terminal. | 
 |  | 
 | Switch on the progress meter by, oddly enough, setting | 
 | \fICURLOPT_NOPROGRESS(3)\fP to zero. This option is set to 1 by default. | 
 |  | 
 | For most applications however, the built-in progress meter is useless and | 
 | what instead is interesting is the ability to specify a progress | 
 | callback. The function pointer you pass to libcurl will then be called on | 
 | irregular intervals with information about the current transfer. | 
 |  | 
 | Set the progress callback by using \fICURLOPT_PROGRESSFUNCTION(3)\fP. And pass | 
 | a pointer to a function that matches this prototype: | 
 |  | 
 | .nf | 
 |  int progress_callback(void *clientp, | 
 |                        double dltotal, | 
 |                        double dlnow, | 
 |                        double ultotal, | 
 |                        double ulnow); | 
 | .fi | 
 |  | 
 | If any of the input arguments is unknown, a 0 will be passed. The first | 
 | argument, the 'clientp' is the pointer you pass to libcurl with | 
 | \fICURLOPT_PROGRESSDATA(3)\fP. libcurl won't touch it. | 
 |  | 
 | .SH "libcurl with C++" | 
 |  | 
 | There's basically only one thing to keep in mind when using C++ instead of C | 
 | when interfacing libcurl: | 
 |  | 
 | The callbacks CANNOT be non-static class member functions | 
 |  | 
 | Example C++ code: | 
 |  | 
 | .nf | 
 | class AClass { | 
 |     static size_t write_data(void *ptr, size_t size, size_t nmemb, | 
 |                              void *ourpointer) | 
 |     { | 
 |       /* do what you want with the data */ | 
 |     } | 
 |  } | 
 | .fi | 
 |  | 
 | .SH "Proxies" | 
 |  | 
 | What "proxy" means according to Merriam-Webster: "a person authorized to act | 
 | for another" but also "the agency, function, or office of a deputy who acts as | 
 | a substitute for another". | 
 |  | 
 | Proxies are exceedingly common these days. Companies often only offer Internet | 
 | access to employees through their proxies. Network clients or user-agents ask | 
 | the proxy for documents, the proxy does the actual request and then it returns | 
 | them. | 
 |  | 
 | libcurl supports SOCKS and HTTP proxies. When a given URL is wanted, libcurl | 
 | will ask the proxy for it instead of trying to connect to the actual host | 
 | identified in the URL. | 
 |  | 
 | If you're using a SOCKS proxy, you may find that libcurl doesn't quite support | 
 | all operations through it. | 
 |  | 
 | For HTTP proxies: the fact that the proxy is a HTTP proxy puts certain | 
 | restrictions on what can actually happen. A requested URL that might not be a | 
 | HTTP URL will be still be passed to the HTTP proxy to deliver back to | 
 | libcurl. This happens transparently, and an application may not need to | 
 | know. I say "may", because at times it is very important to understand that | 
 | all operations over a HTTP proxy use the HTTP protocol. For example, you | 
 | can't invoke your own custom FTP commands or even proper FTP directory | 
 | listings. | 
 |  | 
 | .IP "Proxy Options" | 
 |  | 
 | To tell libcurl to use a proxy at a given port number: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080"); | 
 |  | 
 | Some proxies require user authentication before allowing a request, and you | 
 | pass that information similar to this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password"); | 
 |  | 
 | If you want to, you can specify the host name only in the | 
 | \fICURLOPT_PROXY(3)\fP option, and set the port number separately with | 
 | \fICURLOPT_PROXYPORT(3)\fP. | 
 |  | 
 | Tell libcurl what kind of proxy it is with \fICURLOPT_PROXYTYPE(3)\fP (if not, | 
 | it will default to assume a HTTP proxy): | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4); | 
 |  | 
 | .IP "Environment Variables" | 
 |  | 
 | libcurl automatically checks and uses a set of environment variables to know | 
 | what proxies to use for certain protocols. The names of the variables are | 
 | following an ancient de facto standard and are built up as "[protocol]_proxy" | 
 | (note the lower casing). Which makes the variable \&'http_proxy' checked for a | 
 | name of a proxy to use when the input URL is HTTP. Following the same rule, | 
 | the variable named 'ftp_proxy' is checked for FTP URLs. Again, the proxies are | 
 | always HTTP proxies, the different names of the variables simply allows | 
 | different HTTP proxies to be used. | 
 |  | 
 | The proxy environment variable contents should be in the format | 
 | \&"[protocol://][user:password@]machine[:port]". Where the protocol:// part is | 
 | simply ignored if present (so http://proxy and bluerk://proxy will do the | 
 | same) and the optional port number specifies on which port the proxy operates | 
 | on the host. If not specified, the internal default port number will be used | 
 | and that is most likely *not* the one you would like it to be. | 
 |  | 
 | There are two special environment variables. 'all_proxy' is what sets proxy | 
 | for any URL in case the protocol specific variable wasn't set, and | 
 | \&'no_proxy' defines a list of hosts that should not use a proxy even though a | 
 | variable may say so. If 'no_proxy' is a plain asterisk ("*") it matches all | 
 | hosts. | 
 |  | 
 | To explicitly disable libcurl's checking for and using the proxy environment | 
 | variables, set the proxy name to "" - an empty string - with | 
 | \fICURLOPT_PROXY(3)\fP. | 
 | .IP "SSL and Proxies" | 
 |  | 
 | SSL is for secure point-to-point connections. This involves strong encryption | 
 | and similar things, which effectively makes it impossible for a proxy to | 
 | operate as a "man in between" which the proxy's task is, as previously | 
 | discussed. Instead, the only way to have SSL work over a HTTP proxy is to ask | 
 | the proxy to tunnel trough everything without being able to check or fiddle | 
 | with the traffic. | 
 |  | 
 | Opening an SSL connection over a HTTP proxy is therefor a matter of asking the | 
 | proxy for a straight connection to the target host on a specified port. This | 
 | is made with the HTTP request CONNECT. ("please mr proxy, connect me to that | 
 | remote host"). | 
 |  | 
 | Because of the nature of this operation, where the proxy has no idea what kind | 
 | of data that is passed in and out through this tunnel, this breaks some of the | 
 | very few advantages that come from using a proxy, such as caching.  Many | 
 | organizations prevent this kind of tunneling to other destination port numbers | 
 | than 443 (which is the default HTTPS port number). | 
 |  | 
 | .IP "Tunneling Through Proxy" | 
 | As explained above, tunneling is required for SSL to work and often even | 
 | restricted to the operation intended for SSL; HTTPS. | 
 |  | 
 | This is however not the only time proxy-tunneling might offer benefits to | 
 | you or your application. | 
 |  | 
 | As tunneling opens a direct connection from your application to the remote | 
 | machine, it suddenly also re-introduces the ability to do non-HTTP | 
 | operations over a HTTP proxy. You can in fact use things such as FTP | 
 | upload or FTP custom commands this way. | 
 |  | 
 | Again, this is often prevented by the administrators of proxies and is | 
 | rarely allowed. | 
 |  | 
 | Tell libcurl to use proxy tunneling like this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPPROXYTUNNEL, 1L); | 
 |  | 
 | In fact, there might even be times when you want to do plain HTTP | 
 | operations using a tunnel like this, as it then enables you to operate on | 
 | the remote server instead of asking the proxy to do so. libcurl will not | 
 | stand in the way for such innovative actions either! | 
 |  | 
 | .IP "Proxy Auto-Config" | 
 |  | 
 | Netscape first came up with this. It is basically a web page (usually using a | 
 | \&.pac extension) with a Javascript that when executed by the browser with the | 
 | requested URL as input, returns information to the browser on how to connect | 
 | to the URL. The returned information might be "DIRECT" (which means no proxy | 
 | should be used), "PROXY host:port" (to tell the browser where the proxy for | 
 | this particular URL is) or "SOCKS host:port" (to direct the browser to a SOCKS | 
 | proxy). | 
 |  | 
 | libcurl has no means to interpret or evaluate Javascript and thus it doesn't | 
 | support this. If you get yourself in a position where you face this nasty | 
 | invention, the following advice have been mentioned and used in the past: | 
 |  | 
 | - Depending on the Javascript complexity, write up a script that translates it | 
 | to another language and execute that. | 
 |  | 
 | - Read the Javascript code and rewrite the same logic in another language. | 
 |  | 
 | - Implement a Javascript interpreter; people have successfully used the | 
 | Mozilla Javascript engine in the past. | 
 |  | 
 | - Ask your admins to stop this, for a static proxy setup or similar. | 
 |  | 
 | .SH "Persistence Is The Way to Happiness" | 
 |  | 
 | Re-cycling the same easy handle several times when doing multiple requests is | 
 | the way to go. | 
 |  | 
 | After each single \fIcurl_easy_perform(3)\fP operation, libcurl will keep the | 
 | connection alive and open. A subsequent request using the same easy handle to | 
 | the same host might just be able to use the already open connection! This | 
 | reduces network impact a lot. | 
 |  | 
 | Even if the connection is dropped, all connections involving SSL to the same | 
 | host again, will benefit from libcurl's session ID cache that drastically | 
 | reduces re-connection time. | 
 |  | 
 | FTP connections that are kept alive save a lot of time, as the command- | 
 | response round-trips are skipped, and also you don't risk getting blocked | 
 | without permission to login again like on many FTP servers only allowing N | 
 | persons to be logged in at the same time. | 
 |  | 
 | libcurl caches DNS name resolving results, to make lookups of a previously | 
 | looked up name a lot faster. | 
 |  | 
 | Other interesting details that improve performance for subsequent requests | 
 | may also be added in the future. | 
 |  | 
 | Each easy handle will attempt to keep the last few connections alive for a | 
 | while in case they are to be used again. You can set the size of this "cache" | 
 | with the \fICURLOPT_MAXCONNECTS(3)\fP option. Default is 5. There is very | 
 | seldom any point in changing this value, and if you think of changing this it | 
 | is often just a matter of thinking again. | 
 |  | 
 | To force your upcoming request to not use an already existing connection (it | 
 | will even close one first if there happens to be one alive to the same host | 
 | you're about to operate on), you can do that by setting | 
 | \fICURLOPT_FRESH_CONNECT(3)\fP to 1. In a similar spirit, you can also forbid | 
 | the upcoming request to be "lying" around and possibly get re-used after the | 
 | request by setting \fICURLOPT_FORBID_REUSE(3)\fP to 1. | 
 |  | 
 | .SH "HTTP Headers Used by libcurl" | 
 | When you use libcurl to do HTTP requests, it'll pass along a series of headers | 
 | automatically. It might be good for you to know and understand these. You | 
 | can replace or remove them by using the \fICURLOPT_HTTPHEADER(3)\fP option. | 
 |  | 
 | .IP "Host" | 
 | This header is required by HTTP 1.1 and even many 1.0 servers and should be | 
 | the name of the server we want to talk to. This includes the port number if | 
 | anything but default. | 
 |  | 
 | .IP "Accept" | 
 | \&"*/*". | 
 |  | 
 | .IP "Expect" | 
 | When doing POST requests, libcurl sets this header to \&"100-continue" to ask | 
 | the server for an "OK" message before it proceeds with sending the data part | 
 | of the post. If the POSTed data amount is deemed "small", libcurl will not use | 
 | this header. | 
 |  | 
 | .SH "Customizing Operations" | 
 | There is an ongoing development today where more and more protocols are built | 
 | upon HTTP for transport. This has obvious benefits as HTTP is a tested and | 
 | reliable protocol that is widely deployed and has excellent proxy-support. | 
 |  | 
 | When you use one of these protocols, and even when doing other kinds of | 
 | programming you may need to change the traditional HTTP (or FTP or...) | 
 | manners. You may need to change words, headers or various data. | 
 |  | 
 | libcurl is your friend here too. | 
 |  | 
 | .IP CUSTOMREQUEST | 
 | If just changing the actual HTTP request keyword is what you want, like when | 
 | GET, HEAD or POST is not good enough for you, \fICURLOPT_CUSTOMREQUEST(3)\fP | 
 | is there for you. It is very simple to use: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_CUSTOMREQUEST, "MYOWNREQUEST"); | 
 |  | 
 | When using the custom request, you change the request keyword of the actual | 
 | request you are performing. Thus, by default you make a GET request but you can | 
 | also make a POST operation (as described before) and then replace the POST | 
 | keyword if you want to. You're the boss. | 
 |  | 
 | .IP "Modify Headers" | 
 | HTTP-like protocols pass a series of headers to the server when doing the | 
 | request, and you're free to pass any amount of extra headers that you | 
 | think fit. Adding headers is this easy: | 
 |  | 
 | .nf | 
 |  struct curl_slist *headers=NULL; /* init to NULL is important */ | 
 |  | 
 |  headers = curl_slist_append(headers, "Hey-server-hey: how are you?"); | 
 |  headers = curl_slist_append(headers, "X-silly-content: yes"); | 
 |  | 
 |  /* pass our list of custom made headers */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers); | 
 |  | 
 |  curl_easy_perform(easyhandle); /* transfer http */ | 
 |  | 
 |  curl_slist_free_all(headers); /* free the header list */ | 
 | .fi | 
 |  | 
 | \&... and if you think some of the internally generated headers, such as | 
 | Accept: or Host: don't contain the data you want them to contain, you can | 
 | replace them by simply setting them too: | 
 |  | 
 | .nf | 
 |  headers = curl_slist_append(headers, "Accept: Agent-007"); | 
 |  headers = curl_slist_append(headers, "Host: munged.host.line"); | 
 | .fi | 
 |  | 
 | .IP "Delete Headers" | 
 | If you replace an existing header with one with no contents, you will prevent | 
 | the header from being sent. For instance, if you want to completely prevent the | 
 | \&"Accept:" header from being sent, you can disable it with code similar to this: | 
 |  | 
 |  headers = curl_slist_append(headers, "Accept:"); | 
 |  | 
 | Both replacing and canceling internal headers should be done with careful | 
 | consideration and you should be aware that you may violate the HTTP protocol | 
 | when doing so. | 
 |  | 
 | .IP "Enforcing chunked transfer-encoding" | 
 |  | 
 | By making sure a request uses the custom header "Transfer-Encoding: chunked" | 
 | when doing a non-GET HTTP operation, libcurl will switch over to "chunked" | 
 | upload, even though the size of the data to upload might be known. By default, | 
 | libcurl usually switches over to chunked upload automatically if the upload | 
 | data size is unknown. | 
 |  | 
 | .IP "HTTP Version" | 
 |  | 
 | All HTTP requests includes the version number to tell the server which version | 
 | we support. libcurl speaks HTTP 1.1 by default. Some very old servers don't | 
 | like getting 1.1-requests and when dealing with stubborn old things like that, | 
 | you can tell libcurl to use 1.0 instead by doing something like this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0); | 
 |  | 
 | .IP "FTP Custom Commands" | 
 |  | 
 | Not all protocols are HTTP-like, and thus the above may not help you when | 
 | you want to make, for example, your FTP transfers to behave differently. | 
 |  | 
 | Sending custom commands to a FTP server means that you need to send the | 
 | commands exactly as the FTP server expects them (RFC959 is a good guide | 
 | here), and you can only use commands that work on the control-connection | 
 | alone. All kinds of commands that require data interchange and thus need | 
 | a data-connection must be left to libcurl's own judgement. Also be aware | 
 | that libcurl will do its very best to change directory to the target | 
 | directory before doing any transfer, so if you change directory (with CWD | 
 | or similar) you might confuse libcurl and then it might not attempt to | 
 | transfer the file in the correct remote directory. | 
 |  | 
 | A little example that deletes a given file before an operation: | 
 |  | 
 | .nf | 
 |  headers = curl_slist_append(headers, "DELE file-to-remove"); | 
 |  | 
 |  /* pass the list of custom commands to the handle */ | 
 |  curl_easy_setopt(easyhandle, CURLOPT_QUOTE, headers); | 
 |  | 
 |  curl_easy_perform(easyhandle); /* transfer ftp data! */ | 
 |  | 
 |  curl_slist_free_all(headers); /* free the header list */ | 
 | .fi | 
 |  | 
 | If you would instead want this operation (or chain of operations) to happen | 
 | _after_ the data transfer took place the option to \fIcurl_easy_setopt(3)\fP | 
 | would instead be called \fICURLOPT_POSTQUOTE(3)\fP and used the exact same | 
 | way. | 
 |  | 
 | The custom FTP command will be issued to the server in the same order they are | 
 | added to the list, and if a command gets an error code returned back from the | 
 | server, no more commands will be issued and libcurl will bail out with an | 
 | error code (CURLE_QUOTE_ERROR). Note that if you use \fICURLOPT_QUOTE(3)\fP to | 
 | send commands before a transfer, no transfer will actually take place when a | 
 | quote command has failed. | 
 |  | 
 | If you set the \fICURLOPT_HEADER(3)\fP to 1, you will tell libcurl to get | 
 | information about the target file and output "headers" about it. The headers | 
 | will be in "HTTP-style", looking like they do in HTTP. | 
 |  | 
 | The option to enable headers or to run custom FTP commands may be useful to | 
 | combine with \fICURLOPT_NOBODY(3)\fP. If this option is set, no actual file | 
 | content transfer will be performed. | 
 |  | 
 | .IP "FTP Custom CUSTOMREQUEST" | 
 | If you do want to list the contents of a FTP directory using your own defined | 
 | FTP command, \fICURLOPT_CUSTOMREQUEST(3)\fP will do just that. "NLST" is the | 
 | default one for listing directories but you're free to pass in your idea of a | 
 | good alternative. | 
 |  | 
 | .SH "Cookies Without Chocolate Chips" | 
 | In the HTTP sense, a cookie is a name with an associated value. A server sends | 
 | the name and value to the client, and expects it to get sent back on every | 
 | subsequent request to the server that matches the particular conditions | 
 | set. The conditions include that the domain name and path match and that the | 
 | cookie hasn't become too old. | 
 |  | 
 | In real-world cases, servers send new cookies to replace existing ones to | 
 | update them. Server use cookies to "track" users and to keep "sessions". | 
 |  | 
 | Cookies are sent from server to clients with the header Set-Cookie: and | 
 | they're sent from clients to servers with the Cookie: header. | 
 |  | 
 | To just send whatever cookie you want to a server, you can use | 
 | \fICURLOPT_COOKIE(3)\fP to set a cookie string like this: | 
 |  | 
 |  curl_easy_setopt(easyhandle, CURLOPT_COOKIE, "name1=var1; name2=var2;"); | 
 |  | 
 | In many cases, that is not enough. You might want to dynamically save | 
 | whatever cookies the remote server passes to you, and make sure those cookies | 
 | are then used accordingly on later requests. | 
 |  | 
 | One way to do this, is to save all headers you receive in a plain file and | 
 | when you make a request, you tell libcurl to read the previous headers to | 
 | figure out which cookies to use. Set the header file to read cookies from with | 
 | \fICURLOPT_COOKIEFILE(3)\fP. | 
 |  | 
 | The \fICURLOPT_COOKIEFILE(3)\fP option also automatically enables the cookie | 
 | parser in libcurl. Until the cookie parser is enabled, libcurl will not parse | 
 | or understand incoming cookies and they will just be ignored. However, when | 
 | the parser is enabled the cookies will be understood and the cookies will be | 
 | kept in memory and used properly in subsequent requests when the same handle | 
 | is used. Many times this is enough, and you may not have to save the cookies | 
 | to disk at all. Note that the file you specify to \ICURLOPT_COOKIEFILE(3)\fP | 
 | doesn't have to exist to enable the parser, so a common way to just enable the | 
 | parser and not read any cookies is to use the name of a file you know doesn't | 
 | exist. | 
 |  | 
 | If you would rather use existing cookies that you've previously received with | 
 | your Netscape or Mozilla browsers, you can make libcurl use that cookie file | 
 | as input. The \fICURLOPT_COOKIEFILE(3)\fP is used for that too, as libcurl | 
 | will automatically find out what kind of file it is and act accordingly. | 
 |  | 
 | Perhaps the most advanced cookie operation libcurl offers, is saving the | 
 | entire internal cookie state back into a Netscape/Mozilla formatted cookie | 
 | file. We call that the cookie-jar. When you set a file name with | 
 | \fICURLOPT_COOKIEJAR(3)\fP, that file name will be created and all received | 
 | cookies will be stored in it when \fIcurl_easy_cleanup(3)\fP is called. This | 
 | enables cookies to get passed on properly between multiple handles without any | 
 | information getting lost. | 
 |  | 
 | .SH "FTP Peculiarities We Need" | 
 |  | 
 | FTP transfers use a second TCP/IP connection for the data transfer. This is | 
 | usually a fact you can forget and ignore but at times this fact will come | 
 | back to haunt you. libcurl offers several different ways to customize how the | 
 | second connection is being made. | 
 |  | 
 | libcurl can either connect to the server a second time or tell the server to | 
 | connect back to it. The first option is the default and it is also what works | 
 | best for all the people behind firewalls, NATs or IP-masquerading setups. | 
 | libcurl then tells the server to open up a new port and wait for a second | 
 | connection. This is by default attempted with EPSV first, and if that doesn't | 
 | work it tries PASV instead. (EPSV is an extension to the original FTP spec | 
 | and does not exist nor work on all FTP servers.) | 
 |  | 
 | You can prevent libcurl from first trying the EPSV command by setting | 
 | \fICURLOPT_FTP_USE_EPSV(3)\fP to zero. | 
 |  | 
 | In some cases, you will prefer to have the server connect back to you for the | 
 | second connection. This might be when the server is perhaps behind a firewall | 
 | or something and only allows connections on a single port. libcurl then | 
 | informs the remote server which IP address and port number to connect to. | 
 | This is made with the \fICURLOPT_FTPPORT(3)\fP option. If you set it to "-", | 
 | libcurl will use your system's "default IP address". If you want to use a | 
 | particular IP, you can set the full IP address, a host name to resolve to an | 
 | IP address or even a local network interface name that libcurl will get the IP | 
 | address from. | 
 |  | 
 | When doing the "PORT" approach, libcurl will attempt to use the EPRT and the | 
 | LPRT before trying PORT, as they work with more protocols. You can disable | 
 | this behavior by setting \fICURLOPT_FTP_USE_EPRT(3)\fP to zero. | 
 |  | 
 | .SH "Headers Equal Fun" | 
 |  | 
 | Some protocols provide "headers", meta-data separated from the normal | 
 | data. These headers are by default not included in the normal data stream, but | 
 | you can make them appear in the data stream by setting \fICURLOPT_HEADER(3)\fP | 
 | to 1. | 
 |  | 
 | What might be even more useful, is libcurl's ability to separate the headers | 
 | from the data and thus make the callbacks differ. You can for example set a | 
 | different pointer to pass to the ordinary write callback by setting | 
 | \fICURLOPT_HEADERDATA(3)\fP. | 
 |  | 
 | Or, you can set an entirely separate function to receive the headers, by using | 
 | \fICURLOPT_HEADERFUNCTION(3)\fP. | 
 |  | 
 | The headers are passed to the callback function one by one, and you can | 
 | depend on that fact. It makes it easier for you to add custom header parsers | 
 | etc. | 
 |  | 
 | \&"Headers" for FTP transfers equal all the FTP server responses. They aren't | 
 | actually true headers, but in this case we pretend they are! ;-) | 
 |  | 
 | .SH "Post Transfer Information" | 
 |  | 
 |  [ curl_easy_getinfo ] | 
 |  | 
 | .SH "Security Considerations" | 
 |  | 
 | The libcurl project takes security seriously.  The library is written with | 
 | caution and precautions are taken to mitigate many kinds of risks encountered | 
 | while operating with potentially malicious servers on the Internet.  It is a | 
 | powerful library, however, which allows application writers to make trade offs | 
 | between ease of writing and exposure to potential risky operations.  If | 
 | used the right way, you can use libcurl to transfer data pretty safely. | 
 |  | 
 | Many applications are used in closed networks where users and servers | 
 | can be trusted, but many others are used on arbitrary servers and are fed | 
 | input from potentially untrusted users.  Following is a discussion about | 
 | some risks in the ways in which applications commonly use libcurl and | 
 | potential mitigations of those risks. It is by no means comprehensive, but | 
 | shows classes of attacks that robust applications should consider. The | 
 | Common Weakness Enumeration project at http://cwe.mitre.org/ is a good | 
 | reference for many of these and similar types of weaknesses of which | 
 | application writers should be aware. | 
 |  | 
 | .IP "Command Lines" | 
 | If you use a command line tool (such as curl) that uses libcurl, and you give | 
 | options to the tool on the command line those options can very likely get read | 
 | by other users of your system when they use 'ps' or other tools to list | 
 | currently running processes. | 
 |  | 
 | To avoid this problem, never feed sensitive things to programs using command | 
 | line options. Write them to a protected file and use the \-K option to | 
 | avoid this. | 
 |  | 
 | .IP ".netrc" | 
 | \&.netrc is a pretty handy file/feature that allows you to login quickly and | 
 | automatically to frequently visited sites. The file contains passwords in | 
 | clear text and is a real security risk. In some cases, your .netrc is also | 
 | stored in a home directory that is NFS mounted or used on another network | 
 | based file system, so the clear text password will fly through your network | 
 | every time anyone reads that file! | 
 |  | 
 | To avoid this problem, don't use .netrc files and never store passwords in | 
 | plain text anywhere. | 
 |  | 
 | .IP "Clear Text Passwords" | 
 | Many of the protocols libcurl supports send name and password unencrypted as | 
 | clear text (HTTP Basic authentication, FTP, TELNET etc). It is very easy for | 
 | anyone on your network or a network nearby yours to just fire up a network | 
 | analyzer tool and eavesdrop on your passwords. Don't let the fact that HTTP | 
 | Basic uses base64 encoded passwords fool you. They may not look readable at a | 
 | first glance, but they very easily "deciphered" by anyone within seconds. | 
 |  | 
 | To avoid this problem, use HTTP authentication methods or other protocols that | 
 | don't let snoopers see your password: HTTP with Digest, NTLM or GSS | 
 | authentication, HTTPS, FTPS, SCP, SFTP and FTP-Kerberos are a few examples. | 
 |  | 
 | .IP "Redirects" | 
 | The \fICURLOPT_FOLLOWLOCATION(3)\fP option automatically follows HTTP | 
 | redirects sent by a remote server.  These redirects can refer to any kind of | 
 | URL, not just HTTP.  A redirect to a file: URL would cause the libcurl to read | 
 | (or write) arbitrary files from the local filesystem.  If the application | 
 | returns the data back to the user (as would happen in some kinds of CGI | 
 | scripts), an attacker could leverage this to read otherwise forbidden data | 
 | (e.g.  file://localhost/etc/passwd). | 
 |  | 
 | If authentication credentials are stored in the ~/.netrc file, or Kerberos | 
 | is in use, any other URL type (not just file:) that requires | 
 | authentication is also at risk.  A redirect such as | 
 | ftp://some-internal-server/private-file would then return data even when | 
 | the server is password protected. | 
 |  | 
 | In the same way, if an unencrypted SSH private key has been configured for | 
 | the user running the libcurl application, SCP: or SFTP: URLs could access | 
 | password or private-key protected resources, | 
 | e.g. sftp://user@some-internal-server/etc/passwd | 
 |  | 
 | The \fICURLOPT_REDIR_PROTOCOLS(3)\fP and \fICURLOPT_NETRC(3)\fP options can be | 
 | used to mitigate against this kind of attack. | 
 |  | 
 | A redirect can also specify a location available only on the machine running | 
 | libcurl, including servers hidden behind a firewall from the attacker. | 
 | e.g. http://127.0.0.1/ or http://intranet/delete-stuff.cgi?delete=all or | 
 | tftp://bootp-server/pc-config-data | 
 |  | 
 | Apps can mitigate against this by disabling \fICURLOPT_FOLLOWLOCATION(3)\fP | 
 | and handling redirects itself, sanitizing URLs as necessary. Alternately, an | 
 | app could leave \fICURLOPT_FOLLOWLOCATION(3)\fP enabled but set | 
 | \fICURLOPT_REDIR_PROTOCOLS(3)\fP and install a | 
 | \fICURLOPT_OPENSOCKETFUNCTION(3)\fP callback function in which addresses are | 
 | sanitized before use. | 
 |  | 
 | .IP "Private Resources" | 
 | A user who can control the DNS server of a domain being passed in within a URL | 
 | can change the address of the host to a local, private address which a | 
 | server-side libcurl-using application could then use. e.g. the innocuous URL | 
 | http://fuzzybunnies.example.com/ could actually resolve to the IP address of a | 
 | server behind a firewall, such as 127.0.0.1 or 10.1.2.3.  Apps can mitigate | 
 | against this by setting a \fICURLOPT_OPENSOCKETFUNCTION(3)\fP and checking the | 
 | address before a connection. | 
 |  | 
 | All the malicious scenarios regarding redirected URLs apply just as well to | 
 | non-redirected URLs, if the user is allowed to specify an arbitrary URL that | 
 | could point to a private resource. For example, a web app providing a | 
 | translation service might happily translate file://localhost/etc/passwd and | 
 | display the result.  Apps can mitigate against this with the | 
 | \fICURLOPT_PROTOCOLS(3)\fP option as well as by similar mitigation techniques | 
 | for redirections. | 
 |  | 
 | A malicious FTP server could in response to the PASV command return an IP | 
 | address and port number for a server local to the app running libcurl but | 
 | behind a firewall.  Apps can mitigate against this by using the | 
 | \fICURLOPT_FTP_SKIP_PASV_IP(3)\fP option or \fICURLOPT_FTPPORT(3)\fP. | 
 |  | 
 | .IP "IPv6 Addresses" | 
 | libcurl will normally handle IPv6 addresses transparently and just as easily | 
 | as IPv4 addresses. That means that a sanitizing function that filters out | 
 | addressses like 127.0.0.1 isn't sufficient--the equivalent IPv6 addresses ::1, | 
 | ::, 0:00::0:1, ::127.0.0.1 and ::ffff:7f00:1 supplied somehow by an attacker | 
 | would all bypass a naive filter and could allow access to undesired local | 
 | resources.  IPv6 also has special address blocks like link-local and site-local | 
 | that generally shouldn't be accessed by a server-side libcurl-using | 
 | application.  A poorly-configured firewall installed in a data center, | 
 | organization or server may also be configured to limit IPv4 connections but | 
 | leave IPv6 connections wide open.  In some cases, the CURL_IPRESOLVE_V4 option | 
 | can be used to limit resolved addresses to IPv4 only and bypass these issues. | 
 |  | 
 | .IP Uploads | 
 | When uploading, a redirect can cause a local (or remote) file to be | 
 | overwritten.  Apps must not allow any unsanitized URL to be passed in for | 
 | uploads.  Also, \fICURLOPT_FOLLOWLOCATION(3)\fP should not be used on uploads. | 
 | Instead, the app should handle redirects itself, sanitizing each URL first. | 
 |  | 
 | .IP Authentication | 
 | Use of \fICURLOPT_UNRESTRICTED_AUTH(3)\fP could cause authentication | 
 | information to be sent to an unknown second server.  Apps can mitigate against | 
 | this by disabling \fICURLOPT_FOLLOWLOCATION(3)\fP and handling redirects | 
 | itself, sanitizing where necessary. | 
 |  | 
 | Use of the CURLAUTH_ANY option to \fICURLOPT_HTTPAUTH(3)\fP could result in | 
 | user name and password being sent in clear text to an HTTP server.  Instead, | 
 | use CURLAUTH_ANYSAFE which ensures that the password is encrypted over the | 
 | network, or else fail the request. | 
 |  | 
 | Use of the CURLUSESSL_TRY option to \fICURLOPT_USE_SSL(3)\fP could result in | 
 | user name and password being sent in clear text to an FTP server.  Instead, | 
 | use CURLUSESSL_CONTROL to ensure that an encrypted connection is used or else | 
 | fail the request. | 
 |  | 
 | .IP Cookies | 
 | If cookies are enabled and cached, then a user could craft a URL which | 
 | performs some malicious action to a site whose authentication is already | 
 | stored in a cookie. e.g. http://mail.example.com/delete-stuff.cgi?delete=all | 
 | Apps can mitigate against this by disabling cookies or clearing them | 
 | between requests. | 
 |  | 
 | .IP "Dangerous URLs" | 
 | SCP URLs can contain raw commands within the scp: URL, which is a side effect | 
 | of how the SCP protocol is designed. e.g. | 
 | scp://user:pass@host/a;date >/tmp/test; | 
 | Apps must not allow unsanitized SCP: URLs to be passed in for downloads. | 
 |  | 
 | .IP "Denial of Service" | 
 | A malicious server could cause libcurl to effectively hang by sending a | 
 | trickle of data through, or even no data at all but just keeping the TCP | 
 | connection open.  This could result in a denial-of-service attack. The | 
 | \fICURLOPT_TIMEOUT(3)\fP and/or \fICURLOPT_LOW_SPEED_LIMIT(3)\fP options can | 
 | be used to mitigate against this. | 
 |  | 
 | A malicious server could cause libcurl to effectively hang by starting to send | 
 | data, then severing the connection without cleanly closing the TCP connection. | 
 | The app could install a \fICURLOPT_SOCKOPTFUNCTION(3)\fP callback function and | 
 | set the TCP SO_KEEPALIVE option to mitigate against this.  Setting one of the | 
 | timeout options would also work against this attack. | 
 |  | 
 | A malicious server could cause libcurl to download an infinite amount of data, | 
 | potentially causing all of memory or disk to be filled. Setting the | 
 | \fICURLOPT_MAXFILESIZE_LARGE(3)\fP option is not sufficient to guard against | 
 | this.  Instead, the app should monitor the amount of data received within the | 
 | write or progress callback and abort once the limit is reached. | 
 |  | 
 | A malicious HTTP server could cause an infinite redirection loop, causing a | 
 | denial-of-service. This can be mitigated by using the | 
 | \fICURLOPT_MAXREDIRS(3)\fP option. | 
 |  | 
 | .IP "Arbitrary Headers" | 
 | User-supplied data must be sanitized when used in options like | 
 | \fICURLOPT_USERAGENT(3)\fP, \fICURLOPT_HTTPHEADER(3)\fP, | 
 | \fICURLOPT_POSTFIELDS(3)\fP and others that are used to generate structured | 
 | data. Characters like embedded carriage returns or ampersands could allow the | 
 | user to create additional headers or fields that could cause malicious | 
 | transactions. | 
 |  | 
 | .IP "Server-supplied Names" | 
 | A server can supply data which the application may, in some cases, use as | 
 | a file name. The curl command-line tool does this with --remote-header-name, | 
 | using the Content-disposition: header to generate a file name.  An application | 
 | could also use CURLINFO_EFFECTIVE_URL to generate a file name from a | 
 | server-supplied redirect URL. Special care must be taken to sanitize such | 
 | names to avoid the possibility of a malicious server supplying one like | 
 | "/etc/passwd", "\\autoexec.bat", "prn:" or even ".bashrc". | 
 |  | 
 | .IP "Server Certificates" | 
 | A secure application should never use the \fICURLOPT_SSL_VERIFYPEER(3)\fP | 
 | option to disable certificate validation. There are numerous attacks that are | 
 | enabled by apps that fail to properly validate server TLS/SSL certificates, | 
 | thus enabling a malicious server to spoof a legitimate one. HTTPS without | 
 | validated certificates is potentially as insecure as a plain HTTP connection. | 
 |  | 
 | .IP "Showing What You Do" | 
 | On a related issue, be aware that even in situations like when you have | 
 | problems with libcurl and ask someone for help, everything you reveal in order | 
 | to get best possible help might also impose certain security related | 
 | risks. Host names, user names, paths, operating system specifics, etc. (not to | 
 | mention passwords of course) may in fact be used by intruders to gain | 
 | additional information of a potential target. | 
 |  | 
 | Be sure to limit access to application logs if they could hold private or | 
 | security-related data.  Besides the obvious candidates like user names and | 
 | passwords, things like URLs, cookies or even file names could also hold | 
 | sensitive data. | 
 |  | 
 | To avoid this problem, you must of course use your common sense. Often, you | 
 | can just edit out the sensitive data or just search/replace your true | 
 | information with faked data. | 
 |  | 
 | .SH "Multiple Transfers Using the multi Interface" | 
 |  | 
 | The easy interface as described in detail in this document is a synchronous | 
 | interface that transfers one file at a time and doesn't return until it is | 
 | done. | 
 |  | 
 | The multi interface, on the other hand, allows your program to transfer | 
 | multiple files in both directions at the same time, without forcing you | 
 | to use multiple threads.  The name might make it seem that the multi | 
 | interface is for multi-threaded programs, but the truth is almost the | 
 | reverse.  The multi interface can allow a single-threaded application | 
 | to perform the same kinds of multiple, simultaneous transfers that | 
 | multi-threaded programs can perform.  It allows many of the benefits | 
 | of multi-threaded transfers without the complexity of managing and | 
 | synchronizing many threads. | 
 |  | 
 | To use this interface, you are better off if you first understand the basics | 
 | of how to use the easy interface. The multi interface is simply a way to make | 
 | multiple transfers at the same time by adding up multiple easy handles into | 
 | a "multi stack". | 
 |  | 
 | You create the easy handles you want and you set all the options just like you | 
 | have been told above, and then you create a multi handle with | 
 | \fIcurl_multi_init(3)\fP and add all those easy handles to that multi handle | 
 | with \fIcurl_multi_add_handle(3)\fP. | 
 |  | 
 | When you've added the handles you have for the moment (you can still add new | 
 | ones at any time), you start the transfers by calling | 
 | \fIcurl_multi_perform(3)\fP. | 
 |  | 
 | \fIcurl_multi_perform(3)\fP is asynchronous. It will only execute as little as | 
 | possible and then return back control to your program. It is designed to never | 
 | block. | 
 |  | 
 | The best usage of this interface is when you do a select() on all possible | 
 | file descriptors or sockets to know when to call libcurl again. This also | 
 | makes it easy for you to wait and respond to actions on your own application's | 
 | sockets/handles. You figure out what to select() for by using | 
 | \fIcurl_multi_fdset(3)\fP, that fills in a set of fd_set variables for you | 
 | with the particular file descriptors libcurl uses for the moment. | 
 |  | 
 | When you then call select(), it'll return when one of the file handles signal | 
 | action and you then call \fIcurl_multi_perform(3)\fP to allow libcurl to do | 
 | what it wants to do. Take note that libcurl does also feature some time-out | 
 | code so we advise you to never use very long timeouts on select() before you | 
 | call \fIcurl_multi_perform(3)\fP, which thus should be called unconditionally | 
 | every now and then even if none of its file descriptors have signaled | 
 | ready. Another precaution you should use: always call | 
 | \fIcurl_multi_fdset(3)\fP immediately before the select() call since the | 
 | current set of file descriptors may change when calling a curl function. | 
 |  | 
 | If you want to stop the transfer of one of the easy handles in the stack, you | 
 | can use \fIcurl_multi_remove_handle(3)\fP to remove individual easy | 
 | handles. Remember that easy handles should be \fIcurl_easy_cleanup(3)\fPed. | 
 |  | 
 | When a transfer within the multi stack has finished, the counter of running | 
 | transfers (as filled in by \fIcurl_multi_perform(3)\fP) will decrease. When | 
 | the number reaches zero, all transfers are done. | 
 |  | 
 | \fIcurl_multi_info_read(3)\fP can be used to get information about completed | 
 | transfers. It then returns the CURLcode for each easy transfer, to allow you | 
 | to figure out success on each individual transfer. | 
 |  | 
 | .SH "SSL, Certificates and Other Tricks" | 
 |  | 
 |  [ seeding, passwords, keys, certificates, ENGINE, ca certs ] | 
 |  | 
 | .SH "Sharing Data Between Easy Handles" | 
 | You can share some data between easy handles when the easy interface is used, | 
 | and some data is share automatically when you use the multi interface. | 
 |  | 
 | When you add easy handles to a multi handle, these easy handles will | 
 | automatically share a lot of the data that otherwise would be kept on a | 
 | per-easy handle basis when the easy interface is used. | 
 |  | 
 | The DNS cache is shared between handles within a multi handle, making | 
 | subsequent name resolving faster, and the connection pool that is kept to | 
 | better allow persistent connections and connection re-use is also shared. If | 
 | you're using the easy interface, you can still share these between specific | 
 | easy handles by using the share interface, see \fIlibcurl-share(3)\fP. | 
 |  | 
 | Some things are never shared automatically, not within multi handles, like for | 
 | example cookies so the only way to share that is with the share interface. | 
 | .SH "Footnotes" | 
 |  | 
 | .IP "[1]" | 
 | libcurl 7.10.3 and later have the ability to switch over to chunked | 
 | Transfer-Encoding in cases where HTTP uploads are done with data of an unknown | 
 | size. | 
 | .IP "[2]" | 
 | This happens on Windows machines when libcurl is built and used as a | 
 | DLL. However, you can still do this on Windows if you link with a static | 
 | library. | 
 | .IP "[3]" | 
 | The curl-config tool is generated at build-time (on UNIX-like systems) and | 
 | should be installed with the 'make install' or similar instruction that | 
 | installs the library, header files, man pages etc. | 
 | .IP "[4]" | 
 | This behavior was different in versions before 7.17.0, where strings had to | 
 | remain valid past the end of the \fIcurl_easy_setopt(3)\fP call. |