
6.824 Lab 1: A simple web proxy


6.824 - Spring 2004

Due: Tuesday, February 10th, 1:00pm.

Introduction

Please read Getting started with 6.824 labs before starting this assignment. You will also need Using TCP through sockets at a later stage.

If you have questions, please first read Office hours and asking questions. After you have done that, you can send e-mail to 6.824-staff@pdos.lcs.mit.edu.

In this lab assignment you will write a simple web proxy. A web proxy is a program that reads a request from a browser, forwards that request to a web server, reads the reply from the web server, and forwards the reply back to the browser. People typically use web proxies to cache pages for better performance, to modify web pages in transit (e.g. to remove annoying advertisements), or for weak anonymity.

You'll be writing a web proxy to learn about how to structure servers. For this assignment you'll start simple; in particular your proxy need only handle a single connection at a time. It should accept a new connection from a browser, completely handle the request and response for that browser, and then start work on the next connection. (A real web proxy would be able to handle many connections concurrently.)

In this handout, we use client to mean an application program that establishes connections for the purpose of sending requests[3], typically a web browser (e.g., lynx or Netscape). We use server to mean an application program that accepts connections in order to service requests by sending back responses (e.g., the Apache web server)[1]. Note that a proxy acts as both a client and server. Moreover, a proxy could communicate with other proxies (e.g., a cache hierarchy).

Design Requirements

Your proxy will speak a subset of the HTTP/1.0 protocol, which is defined in RFC 1945. You're only responsible for a small subset of HTTP/1.0, so you can ignore most of the spec. You should make sure your proxy satisfies these requirements:

  • GET requests work.
  • Images/Binary files are transferred correctly.
  • Your webproxy should properly handle Full-Requests (RFC 1945, Section 4.1) up to, and including, 65535 bytes. You should close the connection if a Full-Request is larger than that.
  • You must support URLs with a numerical IP address instead of the server name (e.g. http://18.181.0.31/).
  • You are not allowed to use fork().
  • You may not allocate more than 100MB of memory.
  • You can not have more than 32 open file descriptors.
  • Your proxy should correctly service each request if possible. If an error occurs, and it is possible for the proxy to continue with subsequent requests, it should close the connection and then proceed to the next request. If an error occurs from which the proxy cannot reasonably recover, the proxy should print an error message on the standard error and call exit(1). There are not many non-recoverable errors; perhaps the only ones are failure of the initial socket(), bind(), listen() calls, or a call to accept(). The proxy should never dump core except in situations beyond your control (e.g. a hardware or operating system failure).
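The error policy above can be pictured as a sequential accept loop. The following is only a sketch under stated assumptions: the names listen_or_die, serve_forever, and the handle callback are hypothetical and are not part of the provided skeleton, and the backlog of 5 is an arbitrary choice.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Create a TCP listening socket on `port` (0 lets the kernel pick one).
// Setup failures are treated as non-recoverable: report and exit(1).
int listen_or_die(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); exit(1); }

    int on = 1;  // allow quick restarts while the old port is in TIME_WAIT
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) < 0) {
        perror("bind"); exit(1);
    }
    if (listen(fd, 5) < 0) { perror("listen"); exit(1); }
    return fd;
}

// Serve connections strictly one at a time. `handle` is a hypothetical
// per-connection routine; whatever it returns, the connection is closed
// and the loop moves on, so one bad request does not stop the proxy.
void serve_forever(int listen_fd, bool (*handle)(int client_fd)) {
    for (;;) {
        int client = accept(listen_fd, nullptr, nullptr);
        if (client < 0) { perror("accept"); exit(1); }
        if (!handle(client))
            fprintf(stderr, "request failed; continuing\n");
        close(client);  // always release the descriptor (32-fd limit)
    }
}
```

Closing the client descriptor on every path, success or failure, is what keeps a long-running proxy under the 32-descriptor limit; the descriptor for the server side of each request would need the same treatment inside the handler.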

You do not have to worry about correct implementation of any of the following features; just ignore them as best you can:

  • POST or HEAD requests.
  • URLs of any type other than http.
  • HTTP-headers (RFC 1945, Section 4.2).

If your browser can fetch pages and images through your proxy, and your proxy passes our tester (see below), you're done.

HTTP example without a web proxy

HTTP is a request/response protocol that runs over TCP. A client opens a connection to a web server and sends a request for a file; the server responds with some status information and the file contents, and then closes the connection.

You can try out HTTP yourself:

% telnet web.mit.edu 80

This connects to web.mit.edu on port 80, the default port for HTTP (web) servers.

Then type

GET / HTTP/1.0

followed by two carriage returns. This ends the header section of the request. The server locates the web page and sends it back. You should see it on your screen.

To form the path to the file to be retrieved on a server, the client takes everything after the machine name. For example, http://web.mit.edu/resources.html means we should ask for the file /resources.html. If you see a URL with nothing after the machine name and port, then / is assumed---the server figures out what page to return when just given /. Typically this default page is index.html or home.html.

On most servers, the HTTP server lives on port 80. However, one can specify a different port number in the URL. For example, typing http://web.mit.edu:2206 in your browser will tell it to find a web server on port 2206 on web.mit.edu. (No, this doesn't work for this address.)
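The URL rules in the last two paragraphs (default path /, default port 80, optional explicit port) can be sketched as a small string-splitting routine. This is an illustration only; UrlParts and split_http_url are hypothetical names, and the provided HTTP parser described later already does this work for you.

```cpp
#include <cstdlib>
#include <string>

struct UrlParts {
    std::string host;
    int port;
    std::string path;
};

// Split "http://host[:port][/path]" into its parts.
// Returns false for anything that is not an http URL or has a bad port.
bool split_http_url(const std::string &url, UrlParts &out) {
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0) return false;

    std::string rest = url.substr(scheme.size());
    std::string::size_type slash = rest.find('/');
    std::string authority = rest.substr(0, slash);
    // Nothing after the machine name and port means "/" is assumed.
    out.path = (slash == std::string::npos) ? "/" : rest.substr(slash);

    std::string::size_type colon = authority.find(':');
    out.host = authority.substr(0, colon);
    out.port = 80;  // default HTTP port
    if (colon != std::string::npos) {
        std::string digits = authority.substr(colon + 1);
        if (digits.empty() || digits.size() > 5) return false;
        for (char c : digits)
            if (c < '0' || c > '9') return false;
        out.port = atoi(digits.c_str());
        if (out.port < 1 || out.port > 65535) return false;
    }
    return !out.host.empty();
}
```

With this sketch, http://web.mit.edu/resources.html splits into host web.mit.edu, port 80, and path /resources.html, while http://web.mit.edu:2206 yields port 2206 and path /.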

HTTP (request) example with a web proxy

Before you can do this example, you need to tell your web browser to use a web proxy. This explanation assumes you are running Mozilla, but things should be remarkably similar for Netscape. Choose ``Edit'' ---> ``Preferences''. Then choose ``Advanced'' ---> ``Proxies''. Click on ``Manual proxy configuration''. Now set the ``HTTP proxy'' to speakeasy-mit-ron.lcs.mit.edu and port 3128. Mozilla will now send all HTTP requests to this web proxy rather than directly to web servers.

Lynx---a poor man's browser---can be told to use this web proxy by setting the environment variable http_proxy to speakeasy-mit-ron.lcs.mit.edu:3128.

Now to the real stuff.

You can use nc to peek at HTTP requests that a browser sends to a web proxy. nc lets you read and write data across network connections using UDP or TCP[4]. The class machines have nc installed.

First we'll examine the requests that a browser sends to the proxy. We'll use nc to listen on a port and direct our web browser (Lynx) to use that host and port as a proxy. We're going to let nc listen on port 8888 and tell Lynx to use a web proxy on port 8888.

% nc -lp 8888

This tells nc to listen on port 8888. Chances are that you will have to choose a different port number than 8888 because someone else may be using that port. Choose a number greater than 1024 and less than 65536. Now try, on the same machine, to retrieve a web page using port 8888 as a proxy:

% env http_proxy=http://localhost:8888/ lynx -source http://www.yahoo.com

This tells Lynx to fetch http://www.yahoo.com using a web proxy on port 8888, which happens to be our spy friend nc.

Netcat neatly prints out the request headers that Lynx sent:

% nc -lp 8888
GET http://www.yahoo.com/ HTTP/1.0
Host: www.yahoo.com
Accept: text/html, text/plain, application/vnd.rn-rn_music_package, application/x-freeamp-theme, audio/mp3, audio/mpeg, audio/mpegurl, audio/scpls, audio/x-mp3, audio/x-mpeg, audio/x-mpegurl, audio/x-scpls, audio/mod, image/*, video/mpeg, video/*
Accept: application/pgp, application/pdf, application/postscript, message/partial, message/external-body, x-be2, application/andrew-inset, text/richtext, text/enriched, x-sun-attachment, audio-file, postscript-file, default, mail-file
Accept: sun-deskset-message, application/x-metamail-patch, application/msword, text/sgml, */*;q=0.01
Accept-Encoding: gzip, compress
Accept-Language: en
User-Agent: Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b

The GET request on the first line tells the proxy to get file http://www.yahoo.com using HTTP version 1.0. Notice how this request is quite different from the example without a web proxy! The protocol and machine name (http://www.yahoo.com) are now part of the request. In the previous example this part was omitted. Look in RFC 1945 for details on the remaining lines. (It's effective reading material if you really can't sleep and Dostoevsky didn't do the trick.)

HTTP (reply) example with a web proxy

The previous example shows the HTTP request. Now we'll try to see what a real web proxy (speakeasy-mit-ron.lcs.mit.edu port 3128) sends to a web server. To achieve this we use nc to be a fake web server. Start the ``fake server'' on anguish.lcs.mit.edu with the following command:

% nc -lp 8888

Again, you may have to choose a different number if 8888 turns out to be taken by someone else.

% env http_proxy=http://speakeasy-mit-ron.lcs.mit.edu:3128/ lynx -source http://anguish.lcs.mit.edu:8888

Needless to say, you should replace 8888 by whatever port you chose to run nc on. nc will show the following request:

% nc -lp 8888
GET / HTTP/1.0
Accept: text/html, text/plain, audio/x-pn-realaudio, audio/vnd.rn-realaudio, application/smil, text/vnd.rn-realtext, video/vnd.rn-realvideo, image/vnd.rn-realflash, application/x-shockwave-flash2-preview, application/sdp, application/x-sdp
Accept: application/vnd.rn-realmedia, image/vnd.rn-realpix, audio/wav, audio/x-wav, audio/x-pn-wav, audio/x-pn-windows-acm, audio/basic, audio/x-pn-au, audio/aiff, audio/x-aiff, audio/x-pn-aiff, text/sgml, video/mpeg, image/jpeg, image/tiff
Accept: image/x-rgb, image/png, image/x-xbitmap, image/x-xbm, image/gif, application/postscript, */*;q=0.01
Accept-Encoding: gzip, compress
Accept-Language: en
User-Agent: Lynx/2.8.4rel.1 libwww-FM/2.14
Host: anguish.lcs.mit.edu:8888
X-RAN-Loopstop: true
X-RAN-Loopstop: true
Via: 1.0 speakeasy.ron.lcs.mit.edu:3128 (squid/2.5.STABLE2), 1.0 speakeasy.ron.lcs.mit.edu:3148 (squid/2.5.STABLE4), 1.0 nyu.ron.lcs.mit.edu:3128 (squid/2.5.STABLE4)
X-Forwarded-For: 18.26.4.9, 127.0.0.1, unknown
Cache-Control: max-age=259200
Connection: keep-alive

Notice how the web proxy stripped away the http://anguish.lcs.mit.edu:8888 part from the request!

Your web proxy

Your web proxy will have to translate the requests that the client makes (the ones that start with ``GET http://machinename'') into requests that the server understands. So much for the bad news. The good news is that we provide you with some helpful code that will make this very easy to do.
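To make the translation concrete, here is a rough sketch of rewriting the first line of a proxy-style request into the form a web server expects. The function name to_origin_request_line is hypothetical, the input is assumed to be the request line without its trailing line terminator, and in the lab itself the provided parser would normally supply the pieces instead.

```cpp
#include <string>

// Turn "GET http://host[:port]/path HTTP/1.0" into "GET /path HTTP/1.0".
// Returns an empty string if the line is not in the expected shape.
std::string to_origin_request_line(const std::string &line) {
    const std::string prefix = "GET http://";
    if (line.compare(0, prefix.size(), prefix) != 0) return "";

    std::string::size_type url_end = line.find(' ', prefix.size());
    if (url_end == std::string::npos) return "";  // no HTTP version present
    std::string version = line.substr(url_end + 1);
    if (version.empty()) return "";

    // Everything between "http://" and the space is host[:port][/path].
    std::string rest = line.substr(prefix.size(), url_end - prefix.size());
    std::string::size_type slash = rest.find('/');
    std::string path = (slash == std::string::npos) ? "/" : rest.substr(slash);

    return "GET " + path + " " + version;
}
```

The host and port that are dropped from the request line are still needed, of course: they tell the proxy where to open its outgoing connection.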

Your web proxy will listen on a port other than port 80, so as to avoid conflicts with regular web servers.

Once the request line has been received, the web proxy should continue reading the input from the client until it encounters a blank line. The proxy should then fetch the URL from the appropriate server, forward the response back to the client, and close the connection. The proxy should forward response data as it arrives, rather than buffering the entire response; this allows the proxy to handle huge responses without running out of memory.

Your web proxy has to support the GET method only [3]. A GET method takes two arguments: the file to be retrieved and the HTTP version. Additional headers may follow the request.

Getting Started

We have provided a skeleton webproxy directory. It is available at http://pdos.lcs.mit.edu/6.824/labs/webproxy1.tar.gz. The following sequence of commands should yield a compiled version of the server you should extend to pass the tests.

% wget http://pdos.lcs.mit.edu/6.824/labs/webproxy1.tar.gz
% tar xzvf webproxy1.tar.gz
% cd webproxy1
% gmake

The tarball contains http.C, http.h, Makefile, webproxy1.C and webproxy1-test.C. The first two files will help you parse HTTP requests. The Makefile is, as its meaningful name implies, a Makefile. webproxy1.C is a pretty useless web server that, nonetheless, should help you on your way. webproxy1-test.C is our testing program, which checks your program for correctness.

http.C and http.h: an HTTP parser

We have provided a parser for proxy-style HTTP requests. It is implemented in the files http.C and http.h that are included in the tarball.

http.h defines the class httpreq, which inherits from the class httpparse. (If you are unfamiliar with C++ inheritance, consult the Stroustrup C++ language guide referenced on the course information page. Don't drop this book on someone's face. It's a pretty hefty book.)

To parse a request, first create a httpreq object. Then, parse the (potentially incomplete) HTTP request by feeding it to int parse (char *buf, ssize_t len) until it returns 1, indicating that the headers are complete. buf should be the buffer that contains the (potentially incomplete) HTTP request. len is the length of the HTTP request fragment in buf. Notice that parse needs to see the whole request you have read so far.

parse returns 1 if the HTTP request is complete, 0 if it needs more data to complete, or -1 on a parse error. parse does not modify the contents of buf. Once parse returns 1, you can call---amongst others---the following methods on the calling httpreq.

char* method()   The 'type' of request (POST, GET, HEAD)
char* host()     The destination host
short port()     The destination port
char* path()     The filename part of the requested URL
char* url()      The requested URL
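The return-value contract of parse (1, 0, or -1) suggests a read loop that accumulates bytes and re-parses the whole buffer after each read. The sketch below takes the parser as a callback so that it stands alone; read_request, ParseFn, and kMaxRequest are hypothetical names, and in the lab the callback would presumably wrap a call to parse on an httpreq object.

```cpp
#include <cerrno>
#include <functional>
#include <string>
#include <unistd.h>

const size_t kMaxRequest = 65535;  // largest Full-Request the proxy accepts

// A parse callback with the same contract as httpreq::parse:
// 1 = complete, 0 = need more data, -1 = parse error.
using ParseFn = std::function<int(char *buf, ssize_t len)>;

// Read from `fd` until `parse` reports a complete request.
// Returns 1 on success, or -1 on a parse error, a premature close,
// a read error, or a request that grows past kMaxRequest bytes.
int read_request(int fd, const ParseFn &parse, std::string &req) {
    req.clear();
    char chunk[4096];
    for (;;) {
        ssize_t n = read(fd, chunk, sizeof(chunk));
        if (n < 0 && errno == EINTR) continue;   // interrupted: retry
        if (n <= 0) return -1;                   // error or early close
        req.append(chunk, static_cast<size_t>(n));
        if (req.size() > kMaxRequest) return -1; // too large: caller closes
        // parse always sees everything read so far, not just the new chunk
        int r = parse(&req[0], static_cast<ssize_t>(req.size()));
        if (r != 0) return r;
    }
}
```

On any -1 result the caller would close the client connection and move on to the next one, which covers the split-request, malformed-request, premature-close, and oversized-request cases with a single code path.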

Here's a simple program that illustrates the use of httpreq.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "http.h"

int
main()
{
  httpreq *r = new httpreq();
  char buf[512];
  int ret;

  // incomplete header: parse should report that it needs more data
  strcpy(buf, "GET http://web.mit.edu/index.html");
  ret = r->parse(buf, strlen(buf));
  printf("ret %d file %s\n", ret, ret == 1 ? r->path() : "(none)");

  // complete header: the request line plus the terminating blank line
  strcat(buf, " HTTP/1.0\r\n\r\n");
  ret = r->parse(buf, strlen(buf));
  printf("ret %d file %s\n", ret, ret == 1 ? r->path() : "(none)");

  delete r;
  exit(0);
}

Documentation

You may want to read Using TCP through sockets to learn about socket programming in C/C++. Also, take a look at the references at the bottom of this page.

Running and testing the proxy

Your proxy program should take exactly one argument, a port number on which to listen. For example, to run the proxy on port 2000:

% ./webproxy1 2000
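Validating the port argument before using it could be done along these lines. This is a sketch only; parse_port is a hypothetical helper and is not part of the provided skeleton.

```cpp
#include <cerrno>
#include <cstdlib>

// Parse a decimal TCP port from a command-line argument.
// Returns the port (1..65535), or -1 if the text is not a valid port.
int parse_port(const char *arg) {
    if (arg == nullptr || *arg == '\0') return -1;
    char *end = nullptr;
    errno = 0;
    long v = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0') return -1;  // overflow or trailing junk
    if (v < 1 || v > 65535) return -1;
    return static_cast<int>(v);
}
```

A main function would then check that exactly one argument was given, call parse_port on it, and print a usage message and exit(1) if the result is -1.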

As a first test of the proxy you should attempt to use it to browse the web. Set up your web browser to use one of the class machines running your proxy as a proxy and experiment with a variety of different pages.

When you think your proxy is ready, you can run it against the test program webproxy1-test, our tester. Run the tester with your proxy as an argument:

% ./webproxy1-test ./webproxy1

Note that this may take several minutes to complete. The test program runs the following tests:

Ordinary fetch

This test is the "normal case". We send a normal HTTP 1.0 GET request and expect the correct web page.

Split request

This test splits the HTTP request into two chunks. The first chunk contains a partial HTTP request. The second chunk completes the first, after which the tester expects the correct web page contents to come back.

Large request

The tester does a request of exactly 65535 bytes.

Large response

The tester fetches a web page larger than the maximum amount of memory available to your web proxy.

Zero-size response

The tester fetches a web page without a body.

Recover after bad connect

The tester sends a request with a URL that specifies a false port. Your proxy will attempt to make a connection to a bogus port. Soon thereafter, the tester tries to fetch a valid page to see if your proxy is still doing ok.

Malformed request

The tester sends an HTTP request that is not syntactically correct. After that, it tries to fetch a valid page to see if your proxy is still doing ok.

Premature client close()

The tester sends a partial HTTP request and then closes the connection. After that, it tries to fetch a valid page to see if your proxy is still doing ok.

Infinitely long request

The tester swamps your proxy with a request larger than 65535 bytes. The tester expects your proxy to close the connection. After that, it tries to fetch a valid page to see if your proxy is still doing ok.

Stress test

The tester stress tests your web proxy with a ruthless combination of ordinary fetches, split requests, malformed requests, and large responses. This may expose memory leaks, unclosed connections, and random other bugs.

Collaboration policy

You must write all the code you hand in for the programming assignments, except for code that we give you as part of the assignment. You are not allowed to look at anyone else's solution (and you're not allowed to look at solutions from previous years). You may discuss the assignments with other students, but you may not look at or copy each other's code.

Handin procedure

You should hand in a gzipped tarball webproxy1-handin.tgz produced by gmake dist. Copy this file to ~/handin/webproxy1-handin.tgz. Do not make this file world readable! We will use the first copy of the file that we can find after the deadline---we try every few minutes. Don't bother to copy a new version over the old one hoping that we will use it instead. We won't.

References

[1] Apache Web Proxy, http://www.apache.org/docs/mod/mod_proxy.html.
[2] T. Berners-Lee, et al. RFC 1945: Hypertext Transfer Protocol - HTTP/1.0, May 1996.
[3] CERN Web Proxy, http://www.w3.org/Daemon/User/Proxies/Proxies.html.
[4] Netcat. http://www.atstake.com/research/tools/nc110.txt.