The Pattern Module for Python
Pattern is a web mining module for the Python programming language.
It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, KNN, SVM), and data visualization (graph networks).
The module is bundled with 30+ example scripts and 350+ unit tests.
Installation
Pattern is written for Python 2.5+ (no support for Python 3 yet). The module has no external dependencies except when using LSA in the vector module, which requires NumPy (installed by default on Mac OS X).
To install it so that the module is available in all your scripts, open a terminal and do:
> cd pattern-2.4
> python setup.py install
If you have pip, you can automatically download and install from the PyPI repository:
> pip install pattern
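One way to confirm that either installation method worked is to try importing the package from Python. The following is a minimal sketch, not part of Pattern itself: the helper name is_importable is hypothetical, and it checks numpy as well only because the LSA functionality mentioned above depends on it.

```python
def is_importable(name):
    """Return True if the top-level module `name` can be imported."""
    try:
        __import__(name)
        return True
    except Exception:
        # ImportError if the module is missing; other errors can occur
        # if an installed copy is incompatible with this interpreter.
        return False

# Report on the package itself and on its one optional dependency.
for name in ("pattern", "numpy"):
    status = "found" if is_importable(name) else "missing"
    print("%s: %s" % (name, status))
```

If pattern is reported as missing, the manual options below may help.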
If none of the above works, you can make Python aware of the module in three ways:
- Put the pattern subfolder in the same folder as your script.
- Put the pattern subfolder in the standard location for modules so it is available to all scripts:
c:\python25\Lib\site-packages\ (Windows),
/Library/Python/2.5/site-packages/ (Mac OS X),
/usr/lib/python2.5/site-packages/ (Unix).
- Add the location of the module to sys.path in your script, before importing it:
>>> MODULE = '/users/tom/desktop/pattern'
>>> import sys
>>> if MODULE not in sys.path: sys.path.append(MODULE)
>>> from pattern.en import parse, Sentence
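The sys.path approach can be wrapped in a small reusable function. This is a sketch under assumptions: the name add_module_path is hypothetical and not something Pattern provides, and the check that the folder exists is an extra safeguard, not part of the original instructions.

```python
import os
import sys

def add_module_path(folder):
    """Append `folder` to sys.path once; return True if it was added."""
    folder = os.path.abspath(folder)
    if not os.path.isdir(folder):
        # Fail early rather than silently adding a path that cannot work.
        raise ValueError("not a directory: %r" % folder)
    if folder in sys.path:
        return False  # already registered, nothing to do
    sys.path.append(folder)
    return True
```

A script would call add_module_path('/users/tom/desktop/pattern') before the line that imports from pattern.en. Calling it again with the same folder leaves sys.path unchanged.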