+---------------+---------------------+-------------------+
| |Lint Badge| | |Test Badge| | |Version Badge| |
+---------------+---------------------+-------------------+
.. |Lint Badge| image:: https://github.com/freelawproject/reporters-db/workflows/Lint/badge.svg
.. |Test Badge| image:: https://github.com/freelawproject/reporters-db/workflows/Tests/badge.svg
.. |Version Badge| image:: https://badge.fury.io/py/reporters-db.svg
Background of the Free Law Reporters Database
=============================================
A long, long time ago near a courthouse not too far away, people started
keeping books of every important opinion that was ever written. These
books became known as *reporters* and were generally created by
librarian-types of yore such as `Mr. William
Cranch <https://en.wikipedia.org/wiki/William_Cranch>`__ and `Alex
Dallas <https://en.wikipedia.org/wiki/Alexander_J._Dallas_%28statesman%29>`__.
These people were busy for the next few centuries and created
*thousands* of these books, culminating in what we know today as West's
reporters or as regional reporters like the "Dakota Reports" or the
thoroughly-named, "Synopses of the Decisions of the Supreme Court of
Texas Arising from Restraints by Conscript and Other Military
Authorities (Robards)."
In this repository we've taken a look at all these reporters and tried
to sort out what we know about them and convert that to data. This data
is available as a JSON file, as Python variables, and can be browsed in an
unofficial CSV (it's usually out of date).
Naturally, converting several centuries' history into clean data results
in a mess, but we've done our best and this mess is in use in a number
of projects as listed below. As of version 3.2.32, this data contains information
about 1,167 reporters and 2,102 name variations.
We hope you'll find this useful to your endeavors and that you'll share
your work with the community if you improve or use this work.
Data Sourcing
=============
This project has been enhanced several times with data from several sources:
1. The original data came from parsing the citation fields for millions of cases in CourtListener.
2. A second huge push came from parsing metadata obtained from two major legal publishers, and by parsing the citation fields of Harvard's Case.law database.
3. An audit was performed and additional fields were added by using regular expressions to find number-word-number strings in the entire Harvard Case.law database. The results of this were sorted by frequency, with the top omissions fixed.
Along the way, small and subtle improvements have been made as gaps were identified and fixed.
The result is that this database should be very complete when it comes to reporter abbreviations and variations. It draws on data from CourtListener, two major legal publishers, and Harvard's Case.law, and hundreds of hours have gone into making it complete.
Installation (Python)
=====================
You can install the Free Law Reporters Database with a single
command:
::
pip install reporters-db
Of course, if you're not using Python, the data is in the ``json``
format, so you should be able to import it using your language of
choice. People occasionally play with converting this to other languages, but
no other implementations are presently known.
API
===
Using this database is pretty simple. As this is a database, there are no
public methods or classes, only variables. Importing any of these
variables loads them all, including loading several JSON files from disk. It is
therefore recommended not to load these variables more than necessary.
The simplest way to understand this data is to import these variables
and look at them.
All variables are imported from the package root as follows:
::
from reporters_db import REPORTERS
The available variables are:
- ``REPORTERS``: This is the main database and contains a huge dict of reporters as described below.
- ``LAWS``: Our database of statutory abbreviations, mapping the statute abbreviations to their metadata. For example, ``Ark. Reg`` is the abbreviation for the ``Arkansas Register``.
- ``JOURNALS``: Same idea as ``LAWS``, but for legal journal abbreviations.
- ``STATE_ABBREVIATIONS``: Bluebook style abbreviations for each state. For example, ``Ala.`` for Alabama and ``Haw.`` for Hawaii.
- ``CASE_NAME_ABBREVIATIONS``: Bluebook style abbreviations for common words, mapping each abbreviation to a list of possible words. For example, ``Admin`` maps to ``["Administrative", "Administration"]``.
A few specialized reporter-related variables are:
- ``VARIATIONS_ONLY``: A dict mapping each variant abbreviation to the list of canonical abbreviations it could represent. For example, ``A. 2d`` is sometimes written with an extra space and maps to ``["A.2d"]``, while the ambiguous ``P.R.`` could be ``["Pen. & W.", "P.R.R.", "P."]``.
- ``EDITIONS``: A simple dict mapping the abbreviation for each reporter edition to the canonical reporter. For example, ``A.2d`` maps to ``A.``.
- ``NAMES_TO_EDITIONS``: A simple dict mapping the name of a reporter back to its canonical abbreviations. For example, ``Atlantic Reporter`` maps to ``['A.', 'A.2d']``.
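To illustrate how these three lookups relate to the main dict, here is a minimal sketch that derives them from a tiny hand-written sample shaped like ``REPORTERS``. The sample data, the assumption that ``variations`` maps each variant to an edition, and the ``build_*`` helper names are all made up for illustration. They are not the package's own code, and in practice you would simply import the real variables.

```python
from collections import defaultdict

# Hypothetical one-reporter sample in the documented shape. The dates are
# the "unknown" placeholders from the template, not real publication dates.
SAMPLE_REPORTERS = {
    "A.": [
        {
            "name": "Atlantic Reporter",
            "editions": {
                "A.": {"start": "1750-01-01T00:00:00", "end": None},
                "A.2d": {"start": "1750-01-01T00:00:00", "end": None},
            },
            # Assumed layout: variant abbreviation -> edition it stands for.
            "variations": {"A. 2d": "A.2d"},
        }
    ],
}


def build_editions(reporters):
    """Map every edition abbreviation to its root reporter key."""
    editions = {}
    for root, entries in reporters.items():
        for entry in entries:
            for edition in entry["editions"]:
                editions[edition] = root
    return editions


def build_names_to_editions(reporters):
    """Map each reporter name to the list of its edition abbreviations."""
    names = defaultdict(list)
    for entries in reporters.values():
        for entry in entries:
            names[entry["name"]].extend(entry["editions"])
    return dict(names)


def build_variations_only(reporters):
    """Map each variant to every edition it might represent."""
    variations = defaultdict(list)
    for entries in reporters.values():
        for entry in entries:
            for variant, edition in entry["variations"].items():
                if edition not in variations[variant]:
                    variations[variant].append(edition)
    return dict(variations)
```

Because a variant can belong to more than one reporter, the last helper collects a list per variant rather than a single value.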
CSV
===
You can make a CSV of this data by running:
::
make_csv.py
We keep a copy of this CSV in this repository (``reporters.csv``), but
it is not kept up to date. It should, however, provide a good idea of
what's here.
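If you only need a quick tabular view, the nested structure can be flattened to one row per edition. The following is a rough sketch of that idea and not the contents of ``make_csv.py``. The column layout, the ``reporters_to_csv`` name, and the sample entry are assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical sample entry in the documented shape.
SAMPLE_REPORTERS = {
    "X.": [
        {
            "name": "Example Reporter",
            "cite_type": "state",
            "editions": {
                "X.": {"start": "1750-01-01T00:00:00", "end": None},
                "X.2d": {"start": "1750-01-01T00:00:00", "end": None},
            },
        }
    ],
}


def reporters_to_csv(reporters):
    """Return CSV text with one row per (reporter, edition) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["key", "name", "cite_type", "edition", "start", "end"])
    for key in sorted(reporters):
        for entry in reporters[key]:
            for edition, dates in entry["editions"].items():
                writer.writerow(
                    [
                        key,
                        entry["name"],
                        entry["cite_type"],
                        edition,
                        dates["start"],
                        dates["end"] or "",  # null end date becomes an empty cell
                    ]
                )
    return buffer.getvalue()


if __name__ == "__main__":
    print(reporters_to_csv(SAMPLE_REPORTERS))
```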
Known Implementations
=====================
1. This work was originally deployed in the
`CourtListener <https://www.courtlistener.com>`__ citation finder
beginning in about 2012. It has been used literally millions of times
to identify citations between cases.
2. An extension for Firefox known as the `Free Law
Ferret <http://citationstylist.org/2013/08/20/free-law-ferret-document-to-cited-cases-in-a-click/>`__
uses this code to find citations in your browser as you read things
-- all over the Web.
3. A Node module called
`Walverine <https://github.com/adelevie/walverine>`__ uses an
iteration of this code to find citations using the V8 JavaScript
engine.
Additional usages can be `found via GitHub <https://github.com/freelawproject/reporters-db/network/dependents?package_id=UGFja2FnZS01MjU0MTgzNg%3D%3D>`__.
Some Notes on the Data
======================
Some things to bear in mind as you are examining the Free Law Reporters
Database:
1. Each reporter key maps to a list of the reporters that the key can
represent. In some cases (especially in early reporters), the key is
ambiguous, referring to more than one possible reporter.
2. Formats follow the Bluebook standard, with variations listed for
local rules and for the other ways lawyers have abbreviated reporters
over the years, whether deliberately or by accident.
3. The ``variations`` key consists of data from local rules, from
organic usage found in our corpus, and from the `Cardiff Index to
Legal Abbreviations <http://www.legalabbrevs.cardiff.ac.uk/>`__. We
use a dict for these values because there can be
variations for each series.
4. ``mlz_jurisdiction`` corresponds to the work that is being done for
Multi-Lingual Zotero. This field is maintained by Frank Bennett and
may sometimes be missing values.
5. Some reporters have ``href`` or ``notes`` fields to provide a link to
the best available reference (often Wikipedia) or to provide notes
about the reporter itself.
6. Regarding dates of the editions, there are a few things to know. In
reporters with multiple series, if multiple volumes have the same
dates, this indicates that the point where one series ends and the
other begins is unknown. If an edition has 1750 as its start date,
this indicates that the actual start date is unknown. Likewise, if an
edition has ``null`` as its end date, that indicates the actual end
date is either unknown, or it's known that the series has not
completed. These areas need research before we can release version
1.1 of this database. Finally, dates are inclusive, so the first and
last opinions in a reporter series have the same dates as the
database.
A complete data point has fields like so:
::
"$citation": [
{
"cite_type": "state|federal|neutral|specialty|specialty_west|specialty_lexis|state_regional|scotus_early",
"editions": {
"$citation": {
"end": null,
"regexes": [],
"start": "1750-01-01T00:00:00"
},
"$citation 2d": {
"end": null,
"regexes": [],
"start": "1750-01-01T00:00:00"
}
},
"examples": [],
"mlz_jurisdiction": [],
"name": "",
"variations": {},
"notes": "",
"href": "",
"publisher": ""
}
],
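The date conventions described above matter most when a key is ambiguous, because the citing document's date can narrow down which reporter is meant. Below is a minimal sketch of that lookup. The sample reporters, the helper names, and the choice to treat a ``null`` end date as open-ended are assumptions made for illustration, not behavior defined by this package.

```python
from datetime import datetime

# Per the notes above, a 1750 start date is a placeholder for "unknown".
UNKNOWN_START = datetime(1750, 1, 1)

# Hypothetical ambiguous key: two made-up reporters share the key "X.".
SAMPLE_REPORTERS = {
    "X.": [
        {
            "name": "Example Reports (early)",
            "editions": {
                "X.": {"start": "1750-01-01T00:00:00", "end": "1850-12-31T00:00:00"}
            },
        },
        {
            "name": "Example Reports (modern)",
            "editions": {"X.": {"start": "1900-01-01T00:00:00", "end": None}},
        },
    ]
}


def parse_date(value):
    """Parse the timestamp format used in the data; JSON null becomes None."""
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")


def start_is_known(edition):
    """False when the start date is the 1750 placeholder."""
    return parse_date(edition["start"]) != UNKNOWN_START


def edition_covers(edition, when):
    """Inclusive range check; a null end date is treated as open-ended."""
    start = parse_date(edition["start"])
    end = parse_date(edition["end"])
    return start <= when and (end is None or when <= end)


def candidates(reporters, key, when):
    """Names of the reporters under `key` whose edition `key` covers `when`."""
    return [
        entry["name"]
        for entry in reporters.get(key, [])
        if key in entry["editions"] and edition_covers(entry["editions"][key], when)
    ]
```

With this sample, a date of 1850-12-31 still resolves to the early reporter because the end date is inclusive, while a date that falls between the two ranges yields no candidates.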
The "regexes" field and regexes.json placeholders
-------------------------------------------------
The "regexes" field can contain raw regular expressions to match a custom citation format,
or can contain placeholders to be substituted from ``regexes.json`` using
`Python Template formatting <https://docs.python.org/3/library/string.html#template-strings>`__.
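As a sketch of how such placeholder substitution can work, the snippet below expands a template with ``string.Template`` and matches a citation against the result. The placeholder names (``volume``, ``page``, ``edition``), their patterns, and the ``expand`` helper are hypothetical stand-ins. Check ``regexes.json`` for the names that actually exist.

```python
import re
from string import Template

# Hypothetical placeholder table standing in for regexes.json.
PLACEHOLDERS = {
    "volume": r"(?P<volume>\d+)",
    "page": r"(?P<page>\d+)",
}


def expand(regex_template, edition):
    """Substitute $placeholders, escaping the edition so '.' stays literal."""
    # safe_substitute leaves unknown or stray "$" (such as a regex end
    # anchor) untouched instead of raising an error.
    return Template(regex_template).safe_substitute(
        PLACEHOLDERS, edition=re.escape(edition)
    )


pattern = expand("$volume $edition $page", "X.")
match = re.fullmatch(pattern, "12 X. 345")
if match:
    print(match.group("volume"), match.group("page"))
```

Escaping the edition abbreviation matters because most abbreviations contain periods, which would otherwise match any character.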
If custom regexes are provided, the tests will require that all regexes match at least one
example in ``examples`` and that all examples match at least one regex.
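The two-way requirement can be pictured with a small check like the one below. It is only an illustration of the rule, not the project's test code. The ``check_regexes`` name and the use of full-string matching are assumptions.

```python
import re


def check_regexes(regexes, examples):
    """Return (regexes matching no example, examples matching no regex)."""
    compiled = [re.compile(regex) for regex in regexes]
    unused_regexes = [
        c.pattern for c in compiled if not any(c.fullmatch(e) for e in examples)
    ]
    unmatched_examples = [
        e for e in examples if not any(c.fullmatch(e) for c in compiled)
    ]
    return unused_regexes, unmatched_examples


# Both lists are empty when the regexes and examples cover each other.
print(check_regexes([r"\d+ X\. \d+"], ["1 X. 2"]))
```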
When adding a new regex it can be useful to ``pip install exrex`` and run the tests *without*
adding any examples to get a listing of potential citations that would be matched by the new
regex.
``state_abbreviations`` and ``case_name_abbreviations`` files
-------------------------------------------------------------
1. Abbreviations are based on the nineteenth edition of the Bluebook,
supplemented with abbreviations found in our
corpus.
2. ``case_name_abbreviations.json`` contains the abbreviations that are
likely to occur in the case name of an opinion.
3. ``state_abbreviations.json`` contains the abbreviations that are
likely to be used to refer to American states.
Notes on Specific Data Points and References
--------------------------------------------
1. A good way to look up abbreviations is in `Prince's Bieber Dictionary
of Legal Abbreviations <https://books.google.com/books?id=4aJsAwAAQBAJ&dq=%22Ohio+Law+Rep.%22&source=gbs_navlinks_s>`__. You can find a lot of this book on Google Books,
but we have it as a PDF too. Just ask.
2. Mississippi supports neutral citations, but does so in their own
format, as specified in `this
rule <http://www.aallnet.org/main-menu/Advocacy/access/citation/neutralrules/rules-ms.html>`__.
Research is needed for the format in ``reporters.json`` to see if it
is used accidentally as a variant of their rule or whether it is an
error in this database.
3. New Mexico dates confirmed via the `table
here <http://www.nmcompcomm.us/nmcases/pdf/NM%20Reports%20to%20Official%20-%20Vols.%201-75.pdf>`__.
4. Both Puerto Rico and "Pennsylvania State Reports, Penrose and Watts"
use the citation "P.R."
Tests
=====
We have a few tests that make sure things haven't completely broken.
They are run automatically by GitHub Actions each time a push is completed
and should be run by developers as well before pushing. They can be run
with:
::
python tests.py
It's pretty simple, right?
Releases
--------
Update setup.py, add a git tag to the commit with the version number, and push
to master. Be sure you have your tooling set up to push git tags; that's often
not the default. GitHub Actions will push a release to PyPI if the tests pass.
License
=======
This repository is available under the permissive BSD license, making it
easy and safe to incorporate in your own libraries.
Pull and feature requests are welcome. Online editing in GitHub is possible
(and easy!).
Raw data
{
"_id": null,
"home_page": "https://github.com/freelawproject/reporters-db",
"name": "reporters-db",
"maintainer": "Mike Lissner",
"docs_url": null,
"requires_python": null,
"maintainer_email": "mike@free.law",
"keywords": "legal, reporters",
"author": "Mike Lissner",
"author_email": "mike@free.law",
"download_url": "https://files.pythonhosted.org/packages/c1/dd/393b2134f3ff6e2bb9179b68e3752ee7a7c632d909f78f4c841d424d3996/reporters_db-3.2.46.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD",
"summary": "Database of Court Reporters",
"version": "3.2.46",
"project_urls": {
"Homepage": "https://github.com/freelawproject/reporters-db"
},
"split_keywords": [
"legal",
" reporters"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "20f4c1b2a6050e3612b455c79cbf6df2e000c8e3a5c266cfc9d0c352711e958e",
"md5": "2182e037b0bef3034d7719773469e851",
"sha256": "a8c0b0212220af099714b92938d4ffe2f48e21cde6510a1141cc9dcd86f9047f"
},
"downloads": -1,
"filename": "reporters_db-3.2.46-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "2182e037b0bef3034d7719773469e851",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 171009,
"upload_time": "2024-12-04T19:39:19",
"upload_time_iso_8601": "2024-12-04T19:39:19.255898Z",
"url": "https://files.pythonhosted.org/packages/20/f4/c1b2a6050e3612b455c79cbf6df2e000c8e3a5c266cfc9d0c352711e958e/reporters_db-3.2.46-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "c1dd393b2134f3ff6e2bb9179b68e3752ee7a7c632d909f78f4c841d424d3996",
"md5": "0342e6df2eaa2361c21f2ffd30cb64b1",
"sha256": "105bbab035912e3eea95059a9d7b1ee5d2941e19a07b9dd3856e9d80156d87a9"
},
"downloads": -1,
"filename": "reporters_db-3.2.46.tar.gz",
"has_sig": false,
"md5_digest": "0342e6df2eaa2361c21f2ffd30cb64b1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 171209,
"upload_time": "2024-12-04T19:39:21",
"upload_time_iso_8601": "2024-12-04T19:39:21.179244Z",
"url": "https://files.pythonhosted.org/packages/c1/dd/393b2134f3ff6e2bb9179b68e3752ee7a7c632d909f78f4c841d424d3996/reporters_db-3.2.46.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-04 19:39:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "freelawproject",
"github_project": "reporters-db",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "six",
"specs": [
[
">=",
"1.0.0"
]
]
}
],
"tox": true,
"lcname": "reporters-db"
}