HTTP protocol: RFC reading notes

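HTTP is a simple request/response protocol: the client sends a request and the server returns a response. HTTP/1.1 as described in these notes is defined by RFC 2616 (later superseded by RFC 7230-7235 and then RFC 9110-9112).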

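HTTP request definition:
Request = Request-Line              ; 1 request line, required
*(( general-header
| request-header
| entity-header ) CRLF)        ; 2 three kinds of header fields, optional
CRLF
[ message-body ]                ; 3 request message body, optional
Each request consists of a header section and an optional message body, separated by an empty line (a bare CRLF).
1. The header section consists of the request line followed by optional header fields of three kinds.
Request line format: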
Request-Line   = Method SP Request-URI SP HTTP-Version CRLF
1.1 Possible values of Method:
Method = "OPTIONS"
| "GET"
| "HEAD"
| "POST"
| "PUT"
| "DELETE"
| "TRACE"
| "CONNECT"
| extension-method
extension-method = token
1.2 Request-URI definition
Request-URI    = "*" | absoluteURI | abs_path | authority
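1.3 HTTP version
HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT
For example, in the request line "OPTIONS * HTTP/1.1" the version is HTTP/1.1.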
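The request line grammar above can be checked with a few lines of Python. This is only a minimal sketch: the function name parse_request_line is made up for illustration, and it assumes the three parts are separated by exactly one space, as the grammar requires.

```python
def parse_request_line(line):
    """Split 'Method SP Request-URI SP HTTP-Version CRLF' into its three parts."""
    if not line.endswith("\r\n"):
        raise ValueError("request line must end with CRLF")
    parts = line[:-2].split(" ")
    if len(parts) != 3:
        raise ValueError("expected Method, Request-URI and HTTP-Version")
    method, request_uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError("bad HTTP-Version: %r" % version)
    return method, request_uri, version


print(parse_request_line("OPTIONS * HTTP/1.1\r\n"))
```

A real server would usually be more tolerant of malformed input than this sketch is.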
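2. The header fields fall into three groups: general headers, request headers, and entity headers. Each field is also terminated by a CRLF.
Each field has the form   [field name] [colon] [space] [field value] [CRLF]; the allowed values of each field are given in the RFC section that defines it.
2.1 General headers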
general-header = Cache-Control
| Connection
| Date
| Pragma
| Trailer
| Transfer-Encoding
| Upgrade
| Via
| Warning
2.2 Request headers
request-header = Accept
| Accept-Charset
| Accept-Encoding
| Accept-Language
| Authorization
| Expect
| From
| Host
| If-Match
| If-Modified-Since
| If-None-Match
| If-Range
| If-Unmodified-Since
| Max-Forwards
| Proxy-Authorization
| Range
| Referer
| TE
| User-Agent
2.3 Entity headers
entity-header  = Allow
| Content-Encoding
| Content-Language
| Content-Length
| Content-Location
| Content-MD5
| Content-Range
| Content-Type
| Expires
| Last-Modified
| extension-header
extension-header = message-header
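To make the three header groups concrete, the sketch below splits one header line into name and value and reports which group the name belongs to. The header names are the ones RFC 2616 lists; the function names parse_header_line and classify are hypothetical. Field names are compared case-insensitively, as the RFC specifies.

```python
GENERAL = {"cache-control", "connection", "date", "pragma", "trailer",
           "transfer-encoding", "upgrade", "via", "warning"}
REQUEST = {"accept", "accept-charset", "accept-encoding", "accept-language",
           "authorization", "expect", "from", "host", "if-match",
           "if-modified-since", "if-none-match", "if-range",
           "if-unmodified-since", "max-forwards", "proxy-authorization",
           "range", "referer", "te", "user-agent"}
ENTITY = {"allow", "content-encoding", "content-language", "content-length",
          "content-location", "content-md5", "content-range", "content-type",
          "expires", "last-modified"}


def parse_header_line(line):
    """Split 'Name: value CRLF' into (name, value)."""
    name, sep, value = line.partition(":")
    if not sep or not name.strip():
        raise ValueError("not a header line: %r" % line)
    return name.strip(), value.strip()


def classify(name):
    """Return the RFC 2616 group a header field name belongs to."""
    key = name.lower()
    if key in GENERAL:
        return "general-header"
    if key in REQUEST:
        return "request-header"
    if key in ENTITY:
        return "entity-header"
    # Anything else (for example Cookie or DNT) is an extension-header.
    return "extension-header"


name, value = parse_header_line("Host: chenpeng.info\r\n")
print(name, value, classify(name))
```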
3. Request message body definition
message-body = entity-body
| <entity-body encoded as per Transfer-Encoding>
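Putting the three parts together, a request can be serialized as request line, header fields, an empty line, and then the body. The following is a minimal sketch; build_request is a hypothetical helper, and it assumes an un-encoded body whose length is announced with Content-Length.

```python
def build_request(method, request_uri, headers, body=b""):
    """Serialize a request: request line, header fields, empty line, body."""
    lines = ["%s %s HTTP/1.1" % (method, request_uri)]
    fields = list(headers)
    if body:
        # The receiver needs Content-Length to know where the body ends.
        fields.append(("Content-Length", str(len(body))))
    for name, value in fields:
        lines.append("%s: %s" % (name, value))
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request("POST", "/submit", [("Host", "example.com")], b"a=1")
print(raw)
```

The path /submit and the host example.com are placeholders, not part of the original notes.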

HTTP response definition:
    Response      = Status-Line             ; 1 status line, required
    *(( general-header
    | response-header
    | entity-header ) CRLF)   ;  2 header fields, optional
    CRLF
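    [ message-body ]          ;  3 response message body, optional
    Each response consists of a header section and an optional message body, separated by an empty line (a bare CRLF).
    1. The header section consists of the status line followed by optional header fields.
    Status line format: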
    Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
    1.1 Status codes, in five classes
    1xx: Informational - Request received, continuing process
    2xx: Success - The action was successfully received, understood, and accepted
    3xx: Redirection - Further action must be taken in order to complete the request
    4xx: Client Error - The request contains bad syntax or cannot be fulfilled
    5xx: Server Error - The server failed to fulfill an apparently valid request
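The status line and the five status classes can be illustrated the same way. This is a sketch under the assumption that the first digit alone decides the class; parse_status_line and status_class are made-up names.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF'."""
    # The reason phrase may itself contain spaces, so split at most twice.
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 2:
        raise ValueError("not a status line: %r" % line)
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) == 3 else ""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("Status-Code must be three digits: %r" % code)
    return version, int(code), reason


def status_class(code):
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


version, code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\n")
print(version, code, reason, status_class(code))
```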
    2. Header fields
    2.1 The general headers of a response are the same as those of a request
    2.2 Response headers
    response-header = Accept-Ranges
    | Age
    | ETag
    | Location
    | Proxy-Authenticate
    | Retry-After
    | Server
    | Vary
    | WWW-Authenticate
    2.3 The entity headers of a response are the same as those of a request
    3. Response message body
    entity-body  = *OCTET
    entity-body := Content-Encoding( Content-Type( data ) )
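The two-layer model of the entity body can be read as: first the data is given a media type (here, text turned into bytes in some charset), then a content coding is applied on top. Below is a small sketch, assuming a text/plain body in UTF-8 and "Content-Encoding: gzip"; both helper names are hypothetical.

```python
import gzip


def encode_entity_body(text, charset="utf-8"):
    """entity-body := Content-Encoding( Content-Type( data ) )"""
    typed = text.encode(charset)   # Content-Type layer: text/plain in a charset
    return gzip.compress(typed)    # Content-Encoding layer: gzip


def decode_entity_body(body, charset="utf-8"):
    """Undo the layers in reverse order: content coding first, then charset."""
    return gzip.decompress(body).decode(charset)


wire_bytes = encode_entity_body("hello, HTTP")
print(decode_entity_body(wire_bytes))
```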

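Once you understand how HTTP request and response messages are laid out in the TCP byte stream, it is straightforward to emulate an HTTP client or an HTTP server in Python with plain sockets, sending and receiving the data yourself. From there it becomes easier to write more capable scripts with the higher-level modules httplib (http.client in Python 3) and httplib2.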
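As a rough illustration of the plain-socket approach, the sketch below runs a toy client and a toy server over a connected socket pair. socket.socketpair() stands in for a real TCP connection so the example needs no network; a real client would connect with socket.create_connection((host, 80)) instead. All function names are made up, and the server only ever answers with a fixed 200 response.

```python
import socket


def read_head(conn):
    """Read until the empty line (CRLF CRLF) that ends the header section."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    return data


def send_get(conn, path, host):
    """Client side: send a minimal GET request."""
    request = "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % (path, host)
    conn.sendall(request.encode("ascii"))


def serve_once(conn):
    """Server side: read one request head, echo its request line, close."""
    head = read_head(conn)
    request_line = head.split(b"\r\n", 1)[0]
    body = b"you sent: " + request_line
    lines = [b"HTTP/1.1 200 OK",
             b"Content-Type: text/plain",
             b"Content-Length: " + str(len(body)).encode("ascii"),
             b"Connection: close",
             b"",                      # empty line between headers and body
             body]
    conn.sendall(b"\r\n".join(lines))
    conn.close()


def read_response(conn):
    """Client side: read until the server closes, then split head and body."""
    data = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


client, server = socket.socketpair()
send_get(client, "/html/2592", "chenpeng.info")
serve_once(server)
status_line, body = read_response(client)
client.close()
print(status_line)
print(body.decode("ascii"))
```

Both ends run in one thread here only because the messages are small enough to sit in the socket buffers; a real server would accept connections in a loop and handle each one separately.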

Appendix: the complete headers of a GET request:

GET http://chenpeng.info/html/2592 HTTP/1.1
Host: chenpeng.info
Connection: keep-alive
Cache-Control: no-cache
Pragma: no-cache
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36
DNT: 1
Referer: http://chenpeng.info/page/2
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.6,en;q=0.4,zh-TW;q=0.2
Cookie: wp-settings-time-1=1406773429

 
