http://www.sufeinet.com/plugin.php?id=keke_group

Sufei Forum (苏飞论坛)

[Main Group] [2012-12-20][W@lf] Mind the ContentType parameter when using HttpHelper

Posted on 2012-12-20 11:22:42
W@lf(326335) 10:45:47

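Thanks. One more question: why is the data returned for my POST different from what the web page gets back?
W@lf(326335) 10:45:57

The POST parameters are all correct.
[郑州]song、<song_xiaopeng@126.com> 10:46:44
[image: 未命名1.jpg]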
[重庆]Eagle(838010363) 10:48:56

Hey Sufei
[重庆]Eagle(838010363) 10:48:59

Are you there?
[深圳]茂茂(114440636) 10:49:29

The group owner just drifted by~!
[北京]SillyPGM(1545415453) 10:49:37

What's up?
W@lf(326335) 10:49:42
[image: 未命名2.jpg]
[北京]SillyPGM(1545415453) 10:50:02

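Are there any cookies involved?
[北京]SillyPGM(1545415453) 10:50:08

Did you check for that?
[北京]SillyPGM(1545415453) 10:50:18

Are there any dynamic parameters?
W@lf(326335) 10:50:22

No cookies needed.
W@lf(326335) 10:50:30

The values are all fixed.
[郑州] 站长苏飞<sufei.1013@163.com> 10:51:00

Is the data you get incomplete, or just wrong? Open the page directly, view its source, and compare that with what you get through the HttpHelper class.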
[重庆]Eagle(838010363) 10:51:16

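Sufei
[北京]SillyPGM(1545415453) 10:51:27

Why is the date in the POST data different from the one in the Referer?
[重庆]Eagle(838010363) 10:51:38

How do you set the IP that gets passed in that class of yours?
[北京]SillyPGM(1545415453) 10:51:40

One is the 18th, the other is the 20th.
W@lf(326335) 10:51:47
[image: 未命名3.jpg]
W@lf(326335) 10:51:51

This one is from the web page.
[重庆]Eagle(838010363) 10:51:53

I mean the proxy setting.
W@lf(326335) 10:52:03
[郑州] 站长苏飞<sufei.1013@163.com> 10:52:26

The proxy IP is set directly on HttpItem, though I haven't tested that part, haha.
[北京]SillyPGM(1545415453) 10:52:38

Then the postdata is wrong.
[重庆]Eagle(838010363) 10:52:42
[image: 未命名4.jpg]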

W@lf(326335) 10:53:09

POST http://220.174.241.102:9808/sys/DayReport.aspx HTTP/1.1
Host: 220.174.241.102:9808
Connection: keep-alive
Content-Length: 302
Origin: http://220.174.241.102:9808
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.92 Safari/537.4
Content-Type: application/x-www-form-urlencoded
Accept: text/plain, */*; q=0.01
Referer: http://220.174.241.102:9808/sys/dayreport.aspx?CurrDate=2012-12-20
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.3

TimeType=day&BeginDate=2012-12-20&EndDate=2012-12-20&Item=101%2C141%2C107%2C106%2C108%2C105&Point='460100.001'%2C'460100.002'%2C'460100.003'%2C'460100.004'%2C'460100.005'&StandardName=%25u73AF%25u5883%25u7A7A%25u6C14%25u8D28%25u91CF%25u6807%25u51C6&StandardLevel=2&Judge=0&Source=0&FromHour=0&STLevel=0

[北京]SillyPGM(1545415453) 10:53:10

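[image: 未命名5.jpg]
[北京]SillyPGM(1545415453) 10:53:35
I really hate it when people paste raw source.
W@lf(326335) 10:53:37

What I posted is what the web page sends.
[北京]SillyPGM(1545415453) 10:53:47

No formatting, it's a complete mess.
W@lf(326335) 10:53:48

Then I won't post it.
W@lf(326335) 10:54:12

This is the captured raw request.
[郑州] 站长苏飞<sufei.1013@163.com> 10:54:21

What content do you get back when you use the class?
W@lf(326335) 10:54:41

It returns HTML.
W@lf(326335) 10:55:29

This page.
[重庆]Eagle(838010363) 10:55:35

[image: 未命名6.jpg]
W@lf(326335) 10:56:29

His page has an Ajax call.
[郑州] 站长苏飞<sufei.1013@163.com> 10:57:44

Right, the page uses Ajax, so this is not a problem with the HttpHelper class. You should request the method that the Ajax call uses to fetch its data.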
[重庆]Eagle(838010363) 10:58:05

TimeType=day&BeginDate=2012-12-20&EndDate=2012-12-20&Item=101%2C141%2C107%2C106%2C108%2C105&Point='460100.001'%2C'460100.002'%2C'460100.003'%2C'460100.004'%2C'460100.005'&StandardName=%25u73AF%25u5883%25u7A7A%25u6C14%25u8D28%25u91CF%25u6807%25u51C6&StandardLevel=2&Judge=0&Source=0&FromHour=0&STLevel=0
[重庆]Eagle(838010363) 10:58:13

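This is the submitted data.
W@lf(326335) 10:58:39

Yes, that's exactly the postdata.
W@lf(326335) 10:59:18

Ajax is just sending a request too, right? So HttpHelper should work for this as well?
[郑州] 站长苏飞<sufei.1013@163.com> 11:00:02

Ajax works fine, but you shouldn't request the page that contains the Ajax call; you should request the page that the Ajax call itself requests.
[重庆]Eagle(838010363) 11:00:20

What comes back is a string.
W@lf(326335) 11:00:56

They're the same page.
W@lf(326335) 11:01:08

The Ajax call also requests the current page.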

[重庆]Eagle(838010363) 11:01:49

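What comes back is
<table>
........
</table>

W@lf(326335) 11:01:57

Yes.

[郑州]song、<song_xiaopeng@126.com> 11:02:13

What you're all saying is way over my head.
菜鸟小贝.中山(408599029) 11:02:16

Ajax sends requests to the server, right..
[北京]SillyPGM(1545415453) 11:02:30

I'm on my phone, just sitting back and watching you all capture packets.
菜鸟小贝.中山(408599029) 11:02:33

Ajax is a kind of virtual HTTP socket thing..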
W@lf(326335) 11:04:51
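The packet HttpHelper sends is different from the original one.
W@lf(326335) 11:05:09
Did I write something wrong, or is postdata not allowed to contain single quotes?
W@lf(326335) 11:08:33
It works now.
W@lf(326335) 11:08:39
Adding ContentType = "application/x-www-form-urlencoded" fixed it.
W@lf(326335) 11:10:34
Thanks, everyone.
[郑州] 站长苏飞<sufei.1013@163.com> 11:12:51
This also works without setting the Referer property.
W@lf(326335) 11:14:27
Yeah, I've never done this before, so I didn't know what to watch out for when sending requests.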
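Sufei's summary:
1. The key to this problem is the ContentType property, a string that gives the MIME content type of the request being sent.
2. Naturally, sending a different value gets a different response. Everyone should keep this in mind.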
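
To make the summary concrete, here is a minimal sketch of the same idea using the standard .NET `HttpClient` types rather than HttpHelper. It only builds the request and does not send it. The class name `FormPostSketch`, the method `BuildFormPost`, and the choice of UTF-8 for the body are assumptions for illustration and do not come from the thread.

```csharp
using System;
using System.Net.Http;
using System.Text;

public static class FormPostSketch
{
    // Builds (but does not send) a POST request whose body is labeled as form data.
    // Hypothetical helper: the name and the UTF-8 choice are illustrative assumptions.
    public static HttpRequestMessage BuildFormPost(string url, string postData)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, url);

        // The third argument becomes the media type of the Content-Type header.
        // Without a suitable value here, a server may not parse the body as form fields.
        request.Content = new StringContent(
            postData, Encoding.UTF8, "application/x-www-form-urlencoded");

        return request;
    }
}
```

The returned message could then be passed to `HttpClient.SendAsync`. Whether a given server needs this header depends on how it parses the body, so comparing against a captured browser request, as was done in the chat, remains the practical check.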

Original poster | Posted on 2012-12-20 11:28:39
Let me add the latest code. Running the following is enough to fetch the data correctly.
[code=csharp]
HttpHelper http = new HttpHelper();
HttpItem item = new HttpItem()
{
    URL = "http://220.174.241.102:9808/sys/DayReport.aspx", // Target URL (required)
    Encoding = "gbk", // Encoding (utf-8, gb2312, gbk); optional, detected automatically by default
    Method = "Post", // Request method; optional, defaults to Get
    ContentType = "application/x-www-form-urlencoded", // MIME type of the request body; optional, has a default value
    Postdata = "TimeType=day&BeginDate=2012-12-20&EndDate=2012-12-20&Item=101%2C141%2C107%2C106%2C108%2C105&Point='460100.001'%2C'460100.002'%2C'460100.003'%2C'460100.004'%2C'460100.005'&StandardName=%25u73AF%25u5883%25u7A7A%25u6C14%25u8D28%25u91CF%25u6807%25u51C6&StandardLevel=2&Judge=0&Source=0&FromHour=0&STLevel=0", // POST data; optional, not needed for GET
};
// Get the HTML returned by the server
string html = http.GetHtml(item);
[/code]
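
The Postdata string above was copied from a captured request. As a rough sketch of how such a body could be assembled from individual fields instead, the helper below percent-encodes each key and value with the standard `Uri.EscapeDataString` and joins them with `&`. The names `PostDataSketch` and `BuildPostData` are hypothetical and are not part of HttpHelper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PostDataSketch
{
    // Joins key/value pairs into "k1=v1&k2=v2", percent-encoding each key and value.
    // Hypothetical helper for illustration; not part of HttpHelper.
    public static string BuildPostData(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}

public static class PostDataSketchDemo
{
    public static string Example()
    {
        // The comma in "101,141" is encoded as %2C, matching the captured Item field.
        return PostDataSketch.BuildPostData(new[]
        {
            new KeyValuePair<string, string>("TimeType", "day"),
            new KeyValuePair<string, string>("Item", "101,141"),
        });
    }
}
```

This will not reproduce the captured body byte for byte. The StandardName value there appears to have been produced by JavaScript's `escape()` and then percent-encoded again (`%25u73AF` is `%u73AF` with the `%` encoded), and the capture leaves single quotes unencoded, which `Uri.EscapeDataString` may handle differently depending on the .NET version. When a server is sensitive to such details, sending the captured string unchanged is the safer route.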
Original poster | Posted on 2012-12-20 11:32:19
Everyone, please pay attention to ContentType from now on.
See the list below:
  1. ContentTypes : "ez","application/andrew-inset"
  2.    ContentTypes : "hqx","application/mac-binhex40"
  3.    ContentTypes : "cpt","application/mac-compactpro"
  4.    ContentTypes : "doc","application/msword"
  5.    ContentTypes : "bin","application/octet-stream"
  6.    ContentTypes : "dms","application/octet-stream"
  7.    ContentTypes : "lha","application/octet-stream"
  8.    ContentTypes : "lzh","application/octet-stream"
  9.    ContentTypes : "exe","application/octet-stream"
  10.    ContentTypes : "class","application/octet-stream"
  11.    ContentTypes : "so","application/octet-stream"
  12.    ContentTypes : "dll","application/octet-stream"
  13.    ContentTypes : "oda","application/oda"
  14.    ContentTypes : "pdf","application/pdf"
  15.    ContentTypes : "ai","application/postscript"
  16.    ContentTypes : "eps","application/postscript"
  17.    ContentTypes : "ps","application/postscript"
  18.    ContentTypes : "smi","application/smil"
  19.    ContentTypes : "smil","application/smil"
  20.    ContentTypes : "mif","application/vnd.mif"
  21.    ContentTypes : "xls","application/vnd.ms-excel"
  22.    ContentTypes : "ppt","application/vnd.ms-powerpoint"
  23.    ContentTypes : "wbxml","application/vnd.wap.wbxml"
  24.    ContentTypes : "wmlc","application/vnd.wap.wmlc"
  25.    ContentTypes : "wmlsc","application/vnd.wap.wmlscriptc"
  26.    ContentTypes : "bcpio","application/x-bcpio"
  27.    ContentTypes : "vcd","application/x-cdlink"
  28.    ContentTypes : "pgn","application/x-chess-pgn"
  29.    ContentTypes : "cpio","application/x-cpio"
  30.    ContentTypes : "csh","application/x-csh"
  31.    ContentTypes : "dcr","application/x-director"
  32.    ContentTypes : "dir","application/x-director"
  33.    ContentTypes : "dxr","application/x-director"
  34.    ContentTypes : "dvi","application/x-dvi"
  35.    ContentTypes : "spl","application/x-futuresplash"
  36.    ContentTypes : "gtar","application/x-gtar"
  37.    ContentTypes : "hdf","application/x-hdf"
  38.    ContentTypes : "js","application/x-javascript"
  39.    ContentTypes : "skp","application/x-koan"
  40.    ContentTypes : "skd","application/x-koan"
  41.    ContentTypes : "skt","application/x-koan"
  42.    ContentTypes : "skm","application/x-koan"
  43.    ContentTypes : "latex","application/x-latex"
  44.    ContentTypes : "nc","application/x-netcdf"
  45.    ContentTypes : "cdf","application/x-netcdf"
  46.    ContentTypes : "sh","application/x-sh"
  47.    ContentTypes : "shar","application/x-shar"
  48.    ContentTypes : "swf","application/x-shockwave-flash"
  49.    ContentTypes : "sit","application/x-stuffit"
  50.    ContentTypes : "sv4cpio","application/x-sv4cpio"
  51.    ContentTypes : "sv4crc","application/x-sv4crc"
  52.    ContentTypes : "tar","application/x-tar"
  53.    ContentTypes : "tcl","application/x-tcl"
  54.    ContentTypes : "tex","application/x-tex"
  55.    ContentTypes : "texinfo","application/x-texinfo"
  56.    ContentTypes : "texi","application/x-texinfo"
  57.    ContentTypes : "t","application/x-troff"
  58.    ContentTypes : "tr","application/x-troff"
  59.    ContentTypes : "roff","application/x-troff"
  60.    ContentTypes : "man","application/x-troff-man"
  61.    ContentTypes : "me","application/x-troff-me"
  62.    ContentTypes : "ms","application/x-troff-ms"
  63.    ContentTypes : "ustar","application/x-ustar"
  64.    ContentTypes : "src","application/x-wais-source"
  65.    ContentTypes : "xhtml","application/xhtml+xml"
  66.    ContentTypes : "xht","application/xhtml+xml"
  67.    ContentTypes : "zip","application/zip"
  68.    ContentTypes : "au","audio/basic"
  69.    ContentTypes : "snd","audio/basic"
  70.    ContentTypes : "mid","audio/midi"
  71.    ContentTypes : "midi","audio/midi"
  72.    ContentTypes : "kar","audio/midi"
  73.    ContentTypes : "mpga","audio/mpeg"
  74.    ContentTypes : "mp2","audio/mpeg"
  75.    ContentTypes : "mp3","audio/mpeg"
  76.    ContentTypes : "aif","audio/x-aiff"
  77.    ContentTypes : "aiff","audio/x-aiff"
  78.    ContentTypes : "aifc","audio/x-aiff"
  79.    ContentTypes : "m3u","audio/x-mpegurl"
  80.    ContentTypes : "ram","audio/x-pn-realaudio"
  81.    ContentTypes : "rm","audio/x-pn-realaudio"
  82.    ContentTypes : "rpm","audio/x-pn-realaudio-plugin"
  83.    ContentTypes : "ra","audio/x-realaudio"
  84.    ContentTypes : "wav","audio/x-wav"
  85.    ContentTypes : "pdb","chemical/x-pdb"
  86.    ContentTypes : "xyz","chemical/x-xyz"
  87.    ContentTypes : "bmp","image/bmp"
  88.    ContentTypes : "gif","image/gif"
  89.    ContentTypes : "ief","image/ief"
  90.    ContentTypes : "jpeg","image/jpeg"
  91.    ContentTypes : "jpg","image/jpeg"
  92.    ContentTypes : "jpe","image/jpeg"
  93.    ContentTypes : "png","image/png"
  94.    ContentTypes : "tiff","image/tiff"
  95.    ContentTypes : "tif","image/tiff"
  96.    ContentTypes : "djvu","image/vnd.djvu"
  97.    ContentTypes : "djv","image/vnd.djvu"
  98.    ContentTypes : "wbmp","image/vnd.wap.wbmp"
  99.    ContentTypes : "ras","image/x-cmu-raster"
  100.    ContentTypes : "pnm","image/x-portable-anymap"
  101.    ContentTypes : "pbm","image/x-portable-bitmap"
  102.    ContentTypes : "pgm","image/x-portable-graymap"
  103.    ContentTypes : "ppm","image/x-portable-pixmap"
  104.    ContentTypes : "rgb","image/x-rgb"
  105.    ContentTypes : "xbm","image/x-xbitmap"
  106.    ContentTypes : "xpm","image/x-xpixmap"
  107.    ContentTypes : "xwd","image/x-xwindowdump"
  108.    ContentTypes : "igs","model/iges"
  109.    ContentTypes : "iges","model/iges"
  110.    ContentTypes : "msh","model/mesh"
  111.    ContentTypes : "mesh","model/mesh"
  112.    ContentTypes : "silo","model/mesh"
  113.    ContentTypes : "wrl","model/vrml"
  114.    ContentTypes : "vrml","model/vrml"
  115.    ContentTypes : "css","text/css"
  116.    ContentTypes : "html","text/html"
  117.    ContentTypes : "htm","text/html"
  118.    ContentTypes : "asc","text/plain"
  119.    ContentTypes : "txt","text/plain"
  120.    ContentTypes : "rtx","text/richtext"
  121.    ContentTypes : "rtf","text/rtf"
  122.    ContentTypes : "sgml","text/sgml"
  123.    ContentTypes : "sgm","text/sgml"
  124.    ContentTypes : "tsv","text/tab-separated-values"
  125.    ContentTypes : "wml","text/vnd.wap.wml"
  126.    ContentTypes : "wmls","text/vnd.wap.wmlscript"
  127.    ContentTypes : "etx","text/x-setext"
  128.    ContentTypes : "xsl","text/xml"
  129.    ContentTypes : "xml","text/xml"
  130.    ContentTypes : "mpeg","video/mpeg"
  131.    ContentTypes : "mpg","video/mpeg"
  132.    ContentTypes : "mpe","video/mpeg"
  133.    ContentTypes : "qt","video/quicktime"
  134.    ContentTypes : "mov","video/quicktime"
  135.    ContentTypes : "mxu","video/vnd.mpegurl"
  136.    ContentTypes : "avi","video/x-msvideo"
  137.    ContentTypes : "movie","video/x-sgi-movie"
  138.    ContentTypes : "ice","x-conference/x-cooltalk"
  139.    ContentTypes : "form","application/x-www-form-urlencoded"
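
A list like this is usually consumed as a lookup table. The sketch below shows one possible shape, with a handful of entries copied from the list above. The class name `ContentTypeLookupSketch` and the fallback to `application/octet-stream` for unknown extensions are assumptions for illustration, not something stated in the thread.

```csharp
using System;
using System.Collections.Generic;

public static class ContentTypeLookupSketch
{
    // A few entries copied from the list above; extend as needed.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "pdf", "application/pdf" },
            { "zip", "application/zip" },
            { "png", "image/png" },
            { "html", "text/html" },
            { "txt", "text/plain" },
            { "xml", "text/xml" },
            { "form", "application/x-www-form-urlencoded" },
        };

    // Returns the mapped MIME type. Unknown extensions fall back to
    // application/octet-stream, which is an assumption made for this sketch.
    public static string Lookup(string extension)
    {
        string key = (extension ?? string.Empty).TrimStart('.');
        string value;
        return Map.TryGetValue(key, out value) ? value : "application/octet-stream";
    }
}
```

For example, `ContentTypeLookupSketch.Lookup(".PDF")` returns `application/pdf`, because the leading dot is trimmed and the comparison ignores case.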
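Posted on 2012-12-28 08:33:07
Wow~ that's thorough. Planning to use packets to fool kind, innocent servers again, are you~
Original poster | Posted on 2012-12-28 09:40:42
幻雪丶逆时光 posted on 2012-12-28 08:33
Wow~ that's thorough. Planning to use packets to fool kind, innocent servers again, are you~

Isn't the browser doing the same thing? It's no different: we just use what the browser does, without building the user interface.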