histcite-python


|Field|Value|
|:----|:----|
|Name|histcite-python|
|Version|2.1.0|
|home_page|None|
|Summary|A Python interface for histcite|
|upload_time|2024-05-05 13:05:21|
|maintainer|None|
|docs_url|None|
|author|None|
|requires_python|>=3.9|
|license|MIT|
|keywords|histcite, citation network, web of science, scopus, cssci|
|requirements|No requirements were recorded.|
|Travis-CI|No Travis.|
|coveralls test coverage|No coveralls.|

# Python implementation of the HistCite tool

[![PyPI](https://img.shields.io/pypi/v/histcite-python)](https://pypi.org/project/histcite-python)
[![Supported Versions](https://img.shields.io/pypi/pyversions/histcite-python.svg)](https://pypi.org/project/histcite-python)
[![Codecov](https://codecov.io/gh/doublessay/histcite-python/graph/badge.svg?token=99V9E2CI1H)](https://codecov.io/gh/doublessay/histcite-python)
[![License](https://img.shields.io/pypi/l/histcite-python.svg)](https://github.com/doublessay/histcite-python/blob/main/LICENSE)

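The original citation analysis tool [HistCite](https://support.clarivate.com/ScientificandAcademicResearch/s/article/HistCite-No-longer-in-active-development-or-officially-supported) is no longer maintained. The version most widely used in China today is [HistCite Pro](https://zhuanlan.zhihu.com/p/20902898), a patched build of the original program made by a student at the University of Science and Technology of China (Zhihu nickname [Tsing](https://www.zhihu.com/people/wq123)). It runs only on `Windows`, which is a significant limitation. Built on [pandas 2.0](https://pandas.pydata.org/docs/dev/index.html) and the visualization tool [Graphviz](https://graphviz.org), this tool implements the core functionality of `HistCite`, works across platforms, and adds support for [other data sources](#data-preparation).

Core features:
- Generate citation network graphs.
- Generate statistics for several units of analysis, including documents, authors, institutions, sources, and author keywords.

Tool comparison:
|Item|histcite-python|HistCite Pro|
|:----|:----|:----|
|Open source|Yes|No|
|Cross-platform|Yes|No, Windows only|
|Supports other data sources|Yes|No, Web of Science only|
|Provides a front-end interface|No|Yes, interactive|
|Citation network graph|Vector image, fairly sharp|Bitmap image, fairly blurry|

## Quick start
```console
$ pip install histcite-python
```

## Data preparation
|Data source|Download instructions|Original file name|
|:----|:----|:----|
|Web of Science|`Core Collection`; choose the `Tab delimited file` or `Plain text file` format, and for record content choose `Full Record and Cited References`, or `Custom selection` with all fields selected.|`savedrecs*.txt`|
|CSSCI|Export from the `CSSCI database` as usual.|`LY_*.txt`|
|Scopus|Switch the site language to English, choose the `CSV` format, and additionally tick `Author keywords` and `Include references` among the export fields, or simply select all fields.|`scopus*.csv`|

> [!WARNING]
> Do not rename the files after downloading them (valid data files are identified by file name), and put all downloaded files in a single, separate folder.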

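Because data files are recognized by name, it can help to check a folder before running the tool. The following is a minimal sketch, not the package's own detection logic: the patterns are copied from the table above, while the `SOURCE_PATTERNS` mapping and the `find_data_files` helper are hypothetical names introduced only for this illustration.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# File-name patterns copied from the data preparation table.
# The matching logic below is an assumption made for illustration.
SOURCE_PATTERNS = {
    "wos": "savedrecs*.txt",
    "cssci": "LY_*.txt",
    "scopus": "scopus*.csv",
}


def find_data_files(folder_path, source):
    """Return the files in folder_path whose names match the pattern for source."""
    pattern = SOURCE_PATTERNS[source]
    return sorted(
        path
        for path in Path(folder_path).iterdir()
        if path.is_file() and fnmatchcase(path.name, pattern)
    )
```

An empty result for the chosen source suggests that the files were renamed or saved to a different folder.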
## Usage
1. Use the command-line tool
```console
$ histcite -h
usage: histcite [-h] (--top TOP | --threshold THRESHOLD | --node NODE) [--disable_timeline] folder_path {wos,cssci,scopus}

A Python interface for histcite.

positional arguments:
  folder_path           Folder path of downloaded data.
  {wos,cssci,scopus}    Data source.

options:
  -h, --help            show this help message and exit
  --top                 Top N nodes with the highest LCS.
  --threshold           Nodes with LCS greater than threshold.
  --disable_timeline    Whether to disable timeline.
```

```console
$ histcite /Users/.../Downloads/dataset wos --top 50
```

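When the tool is driven from a script, the same command can be assembled in Python. This is a hedged sketch that relies only on the flags shown in the help text above; `build_histcite_command` and `run_histcite` are hypothetical helper names, and `--node` is left out because the help text does not describe its value.

```python
import subprocess


def build_histcite_command(folder_path, source, top=None, threshold=None,
                           disable_timeline=False):
    """Build the argument list for the histcite command shown in the help text."""
    if source not in ("wos", "cssci", "scopus"):
        raise ValueError(f"unknown data source: {source}")
    # The usage line puts the selection options in a required, mutually
    # exclusive group, so exactly one of them has to be given.
    if (top is None) == (threshold is None):
        raise ValueError("pass exactly one of top or threshold")
    command = ["histcite", str(folder_path), source]
    if top is not None:
        command += ["--top", str(top)]
    else:
        command += ["--threshold", str(threshold)]
    if disable_timeline:
        command.append("--disable_timeline")
    return command


def run_histcite(*args, **kwargs):
    """Run the command; requires histcite-python to be installed."""
    subprocess.run(build_histcite_command(*args, **kwargs), check=True)
```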
> [!NOTE]
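> The results are saved in a `result` folder under `folder_path` and include:
> - the node information table for the citation network graph, graph_node_info.xlsx
> - the data file for the citation network graph, graph.dot
>     - Use the [Graphviz online editor](http://magjac.com/graphviz-visual-editor/) or a locally installed [Graphviz](https://graphviz.org/) to generate the citation network graph.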

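With a local Graphviz installation, `graph.dot` can be rendered by calling the `dot` executable. The sketch below assumes `dot` is on the `PATH` and that SVG output is wanted; `build_dot_command` and `render_graph` are hypothetical helper names, not part of histcite-python.

```python
import shutil
import subprocess
from pathlib import Path


def build_dot_command(dot_file, output_format="svg"):
    """Return the Graphviz command that renders dot_file next to itself."""
    dot_file = Path(dot_file)
    output_file = dot_file.with_suffix(f".{output_format}")
    return ["dot", f"-T{output_format}", str(dot_file), "-o", str(output_file)]


def render_graph(dot_file, output_format="svg"):
    """Render the graph, failing early if Graphviz is not installed."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz `dot` executable not found on PATH")
    subprocess.run(build_dot_command(dot_file, output_format), check=True)
```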
Example citation network graph:

![](https://raw.githubusercontent.com/doublessay/histcite-python/main/examples/graph.svg)

The corresponding node information is shown below (using the CSSCI data source as an example):
| |AU|TI|PY|SO|LCS|
|:----|:----|:----|:----|:----|:----|
|55|张坤; 查先进|我国智慧图书馆的发展沿革及构建策略研究|2021|国家图书馆学刊|6|
|60|石婷婷; 徐建华; 张雨浓|数字孪生技术驱动下的智慧图书馆应用场景与体系架构设计|2021|情报理论与实践|7|
|63|卢小宾; 宋姬芳; 蒋玲; 洪先锋; 刘静; 张薷|智慧图书馆建设标准探析|2021|中国图书馆学报|9|
|81|程焕文; 钟远薪|智慧图书馆的三维解析|2021|图书馆论坛|10|
|86|段美珍; 初景利; 张冬荣; 解贺嘉|智慧图书馆的内涵特点及其认知模型研究|2021|图书情报工作|7|
|...| | | | | |

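To make the `--top` and `--threshold` options concrete, the sketch below applies both selection rules to the LCS values from the sample table above. It only illustrates the rules as the help text words them ("highest LCS", "greater than threshold"); how the tool itself breaks ties is not documented here, so the tie order in `select_top` is an assumption, and both function names are hypothetical.

```python
def select_top(lcs_by_node, n):
    """Return the n node ids with the highest LCS (ties keep input order)."""
    ranked = sorted(lcs_by_node.items(), key=lambda item: item[1], reverse=True)
    return [node for node, _ in ranked[:n]]


def select_above_threshold(lcs_by_node, threshold):
    """Return the node ids whose LCS is strictly greater than threshold."""
    return [node for node, score in lcs_by_node.items() if score > threshold]


# Node id -> LCS, taken from the sample node information table
sample_lcs = {55: 6, 60: 7, 63: 9, 81: 10, 86: 7}

print(select_top(sample_lcs, 2))              # [81, 63]
print(select_above_threshold(sample_lcs, 7))  # [63, 81]
```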
2. Use Jupyter Notebook, which lets you customize more parameters and export descriptive statistics; see [demo.ipynb](demo.ipynb).

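## Field descriptions
|Field Name|Description|
|:----|:----|
|`GCS`|Global Citation Score: the total number of times a document is cited in the literature database|
|`LCS`|Local Citation Score: the number of times a document is cited within the local collection|
|`GCR`|Global Cited References: the number of references a document cites|
|`LCR`|Local Cited References: the number of a document's references that are in the local collection|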
|`T*` |Total score, e.g. TLCS = Total Local Citation Scores.|
|`Recs`|Count of Records|
|`FAU`|First Author|
|`CAU`|Corresponding Authors|
|`AU`|Authors|
|`CO`|Country of Authors|
|`C1`|Addresses|
|`C3`|Author Affiliations|
|`RP`|Reprint Address|
|`I2`|Institution with Subdivision|
|`TI`|Article Title|
|`SO`|Source Title|
|`DT`|Document Type|
|`DE`|Author Keywords|
|`CR`|Cited References|
|`NR`|Cited Reference Count|
|`TC`|Times Cited Count|
|`PY`|Publication Year|
|`VL`|Volume|
|`IS`|Issue|
|`BP`|Start Page|
|`EP`|End Page|
|`DI`|DOI|
|...|[Please refer to Web of Science fields.](https://webofscience.help.clarivate.com/en-us/Content/export-records.htm)|

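The difference between the local and global scores is easiest to see on a toy collection. The following sketch computes `GCR`, `LCR` and `LCS` from the definitions in the table; the record ids, the `local_records` data and the `local_scores` function are invented for illustration and do not reflect how histcite-python parses or stores its data. `GCS` is not computed because it comes from the database's own citation count rather than from the local collection.

```python
from collections import Counter

# Toy local collection: each record lists the documents it cites.
# "X1" and "X2" stand for references outside the local collection.
local_records = {
    "A": [],
    "B": ["A", "X1"],
    "C": ["A", "B", "X2"],
    "D": ["B", "C"],
}


def local_scores(records):
    """Compute GCR, LCR and LCS for every record in a local collection."""
    local_ids = set(records)
    lcs = Counter()
    scores = {}
    for doc_id, references in records.items():
        unique_refs = set(references)
        # References that are themselves records in the local collection
        local_refs = unique_refs & local_ids
        # Each local reference adds one local citation to the cited record
        lcs.update(local_refs)
        scores[doc_id] = {"GCR": len(unique_refs), "LCR": len(local_refs)}
    for doc_id in scores:
        scores[doc_id]["LCS"] = lcs[doc_id]
    return scores


for doc_id, score in local_scores(local_records).items():
    print(doc_id, score)
```

In this toy data, record `A` cites nothing but is cited by `B` and `C`, so its `LCS` is 2 while its `GCR` and `LCR` are 0.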
## FAQ
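1. Why is the timeline in the generated citation network graph out of order?
- Node positions are adjusted automatically by Graphviz, and this problem tends to appear when there are few nodes. You can hide the timeline by setting a parameter (the `--disable_timeline` option on the command line).

2. Are there other similar tools?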
- [CiteSpace](https://citespace.podia.com/)
- [CitNetExplorer](https://www.citnetexplorer.nl/)
- [VOSviewer](https://www.vosviewer.com/)
- [Connected Papers](https://www.connectedpapers.com/)
- [Litmaps](https://app.litmaps.com/)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "histcite-python",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "histcite, citation network, web of science, scopus, cssci",
    "author": null,
    "author_email": "WangK2 <kw221225@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/c5/59/caf1cb744a17e6256c7d5141d8130de8426e6509ec1685526cee3b24e53b/histcite_python-2.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python interface for histcite",
    "version": "2.1.0",
    "project_urls": {
        "Repository": "https://github.com/doublessay/histcite-python"
    },
    "split_keywords": [
        "histcite",
        " citation network",
        " web of science",
        " scopus",
        " cssci"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "40d0815057fd002fbbd4516e1beda3aaeb6f54970d7d37badf691c910a87ab43",
                "md5": "c323ebdd25c1a7a495b56cc6904c39c9",
                "sha256": "eda7b3e18caca3c599f5140d5832885f9a124ce7dd7c9cd1356430e9d6c0fe77"
            },
            "downloads": -1,
            "filename": "histcite_python-2.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c323ebdd25c1a7a495b56cc6904c39c9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 16551,
            "upload_time": "2024-05-05T13:05:20",
            "upload_time_iso_8601": "2024-05-05T13:05:20.065834Z",
            "url": "https://files.pythonhosted.org/packages/40/d0/815057fd002fbbd4516e1beda3aaeb6f54970d7d37badf691c910a87ab43/histcite_python-2.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c559caf1cb744a17e6256c7d5141d8130de8426e6509ec1685526cee3b24e53b",
                "md5": "608f61123cea7946ee97bd7ced9534c8",
                "sha256": "5649f947b641555a376cfdb18825d279de46c9283c13eea444561f58d6c6e298"
            },
            "downloads": -1,
            "filename": "histcite_python-2.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "608f61123cea7946ee97bd7ced9534c8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 19120,
            "upload_time": "2024-05-05T13:05:21",
            "upload_time_iso_8601": "2024-05-05T13:05:21.731713Z",
            "url": "https://files.pythonhosted.org/packages/c5/59/caf1cb744a17e6256c7d5141d8130de8426e6509ec1685526cee3b24e53b/histcite_python-2.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-05 13:05:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "doublessay",
    "github_project": "histcite-python",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "histcite-python"
}
        