
Python Web Scraping

Download the Conda installer and install it, then use conda to create and manage your Python environments.

1.1 Beautiful Soup

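Beautiful Soup is a Python library for extracting data from HTML and XML files. It works with the parser of your choice to provide idiomatic ways of navigating, searching, and modifying the document, and it can save you hours or even days of work.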
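To make the navigating, searching, and modifying described above concrete, here is a minimal sketch. The HTML snippet and the example.com URLs are made up for illustration, and the built-in `html.parser` is used so the example runs even before lxml is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for illustration.
HTML = """
<html><head><title>Sample page</title></head>
<body>
  <p class="intro">Hello, <b>world</b>!</p>
  <a href="https://example.com/a">First</a>
  <a href="https://example.com/b">Second</a>
</body></html>
"""

# "html.parser" ships with Python; pass "lxml" instead once lxml is installed.
soup = BeautifulSoup(HTML, "html.parser")

# Navigate: walk down the tree by tag name.
title = soup.title.string

# Search: collect the href attribute of every <a> tag.
links = [a["href"] for a in soup.find_all("a")]

# Modify: replace the text inside the <b> tag of the intro paragraph.
soup.find("p", class_="intro").b.string = "Beautiful Soup"
intro_text = soup.find("p", class_="intro").get_text()

print(title)
print(links)
print(intro_text)
```

Swapping `"html.parser"` for `"lxml"` should give the same results on a document this simple; the parsers mainly differ in speed and in how they repair malformed markup.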

Installing Beautiful Soup

$ conda install beautifulsoup4
# Install the lxml parser
$ conda install lxml
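One quick way to confirm that both packages landed in the active environment is to ask Python whether it can locate them. The helper name `check_packages` below is hypothetical and uses only the standard library; note that beautifulsoup4 is imported under the module name `bs4`.

```python
from importlib.util import find_spec


def check_packages(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


# beautifulsoup4 is imported as "bs4"; lxml is imported as "lxml".
status = check_packages(["bs4", "lxml"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

If either line reports `missing`, the package was probably installed into a different conda environment than the one running the script.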
  1. conda install fails with “Collecting package metadata (current_repodata.json): failed”
$ conda install lxml
Collecting package metadata (current_repodata.json): failed
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/osx-64/current_repodata.json>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
'https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/osx-64'

Disabling conda's SSL verification works around the error:

$ conda config --set ssl_verify false
  1. https://conda.io/en/latest/miniconda.html
  2. Beautiful Soup 4.4.0 documentation