# Inference system for hierarchical text classification
Runs a cascade of text classifiers organized according to a given hierarchy of language models.
This Python package was developed for the IntelComp project, as part of deliverable 3.4.
## Setup
```commandline
pip install clf-inference-intelcomp
```
## Usage
Toy example commands:
```python
from clf_inference_intelcomp import Classification
clf = Classification()
res = clf.classify(taxonomy="ipc", text="This is a test")
clf.classify("fos", "This is another test")
```
Output:
```commandline
Running IPC classification at level 0... (model: intelcomp/ipc_level0)
Running IPC classification at level 1... (model: intelcomp/ipc_level1_G)
####### LEVEL 0:
- Class: G
- Full class name: PHYSICS
- Confidence score: 0.9405
####### LEVEL 1:
- Class: 09
- Full class name: EDUCATING; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- Confidence score: 0.3386
Running FOS classification at level 0... (model: intelcomp/fos_level0)
####### LEVEL 0:
- Class: Medicine
- Full class name: None
- Confidence score: 0.5535
{0: ('Medicine', None, 0.5535141229629517)}
```
As shown above, the `classify()` method returns a dictionary with the taxonomy levels as keys,
where each value is a triplet (class, full class name, confidence score) holding the classification output for that level.
Additionally, the `classify_batch()` method works in a similar manner but classifies in batches,
taking a list of texts rather than a single string. Type `help(clf.classify_batch)` for more details.
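The dictionary returned by `classify()` can be post-processed with plain Python. The sketch below works on a hand-written result that mirrors the IPC output shown above (scores rounded), so it runs without downloading any model. The `confident_path` helper and its `0.5` threshold are hypothetical and not part of the package.

```python
# Hand-written result mirroring the IPC output above:
# level -> (class, full class name, confidence score).
result = {
    0: ("G", "PHYSICS", 0.9405),
    1: ("09", "EDUCATING; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS", 0.3386),
}


def confident_path(result, threshold=0.5):
    """Concatenate class codes level by level, stopping at the first low-confidence level.

    Hypothetical helper for illustration; it is not part of clf_inference_intelcomp.
    """
    codes = []
    for level in sorted(result):
        code, _name, score = result[level]
        if score < threshold:
            break
        codes.append(code)
    return "".join(codes)


# Full code across all levels, regardless of confidence.
full_code = "".join(result[level][0] for level in sorted(result))
print(full_code)
print(confident_path(result))
```

Stopping at the first low-confidence level is only one possible policy; since deeper predictions depend on the shallower ones, it seems a reasonable default for a cascade.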
## Customizable attributes
After instantiating an object from the Classification class, you can access the following attributes:
- `WORKING_DIR`: Location where the package files are located. Change only if you want to read YAML files from somewhere else.
- `CACHE_DIR`: Path where models will be cached for faster inference in subsequent calls. Defaults to `~/.cache/huggingface/intelcomp`.
- `DEVICE`: Either "cuda" or "cpu", based on the availability of GPUs.
- `models`: Dictionary with the paths to the different models, hierarchically organized.
- `classes`: Dictionary to map class codes to class names, hierarchically organized as well.
- `avail_taxonomies`: List of taxonomies available in `data/models.yaml`. Note that this is a private attribute that
cannot be edited directly by the user, although it can be refreshed with `update_taxonomies()` after modifying the YAML files.
## Caching
By default, models are downloaded on the fly at inference time. This means that the first time a model is used,
it has to be downloaded first. In subsequent calls, inference will be faster since the model will already be
loaded in memory. For smooth and uninterrupted inference, it is possible to load all models at once (or only the ones
belonging to a specific taxonomy) with the `cache_models()` method. By doing this, the desired models will be cached
in the directory specified by the `CACHE_DIR` attribute, which can be modified if needed.
For instance, to classify texts according to the IPC taxonomy, all IPC models can be loaded beforehand as follows:
```python
from clf_inference_intelcomp import Classification
clf = Classification()
clf.CACHE_DIR = "~/Downloads/intelcomp/models"
clf.cache_models("ipc")
res = clf.classify("ipc", "This will not wait for model downloads.")
```
Similarly, in order to load ALL available models, no matter the taxonomy, simply call `clf.cache_models()` with no arguments.
## Adding/Removing taxonomies
To add or remove taxonomies, the YAML files within the `data` directory must be updated accordingly.
It is important to always follow the original structure, which is showcased in the next section.
The Classification object is initialized from the content of the YAML files at creation time. If the files are
modified afterwards, the taxonomy details can be refreshed by calling the `update_taxonomies()` method.
```python
from clf_inference_intelcomp import Classification
clf = Classification()
print(clf.avail_taxonomies) # ['fos', 'ipc', 'nace2']
# YAML FILES BEING MODIFIED...
clf.update_taxonomies()
print(clf.avail_taxonomies) # ['fos', 'ipc', 'nace2', 'NEW_TAXONOMY']
```
## YAML files
The `data` directory contains two YAML files that can be edited by the user to add new taxonomies:
### `models.yaml`:
Lists the models to be used for each taxonomy, with a different indentation for each of its levels. \
Entries can be either names of models from the Hugging Face Hub or paths to local repositories. \
The first level (i.e. 0) of any taxonomy contains a single model, which is the root node of the
classification tree, while deeper levels list the models to use for the possible outcomes of the prior level.
This is why, for non-zero levels, it is important to make sure that each model key (the value preceding ":")
matches a label produced by the previous model in the chain.
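The way the label predicted at one level selects the model for the next level can be sketched as follows. This is a minimal illustration, not the package's implementation: `fake_predict` is a stand-in that returns canned labels instead of running a real model, `cascade` is a hypothetical function, and the dictionary assumes that level keys are stored as strings.

```python
# Excerpt of the default models.yaml, written as the nested dictionary it describes.
# Assumption: all keys (including level numbers) are strings.
models = {
    "ipc": {
        "0": "intelcomp/ipc_level0",
        "1": {"G": "intelcomp/ipc_level1_G", "H": "intelcomp/ipc_level1_H"},
    }
}


def fake_predict(model_name, text):
    """Stand-in for a real classifier: returns a fixed (label, score) per model."""
    canned = {
        "intelcomp/ipc_level0": ("G", 0.94),
        "intelcomp/ipc_level1_G": ("09", 0.34),
    }
    return canned[model_name]


def cascade(taxonomy, text, predict=fake_predict):
    """Walk the hierarchy: the label predicted at one level selects the next model."""
    levels = models[taxonomy]
    output = {}
    model_name = levels["0"]  # level 0 always holds a single root model
    for level in range(len(levels)):
        label, score = predict(model_name, text)
        output[level] = (label, score)
        next_level = levels.get(str(level + 1))
        if next_level is None or label not in next_level:
            break  # no deeper model is registered for this label
        model_name = next_level[label]
    return output


print(cascade("ipc", "This is a test"))
```

If a predicted label has no entry at the next level, the walk simply stops, which is why a mismatched key in `models.yaml` would silently cut the cascade short in a design like this one.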
### `classes.yaml`:
Lists the possible outputs of all classifiers from `models.yaml`, following the same format. \
It maps class codes to their corresponding names, which makes the results easier to interpret.
**Note**: \
It is important to ensure that the class codes (keys of the inner dictionaries) are read as strings. \
The reason is that some taxonomies (e.g. IPC) have numeric codes with a leading zero (e.g. "01"). \
In these cases, YAML reads the key as an integer and, thus, the leading zero is lost.
The library converts all keys to strings for consistency, but these cases with leading zeros
are hard to handle, since there is no way to tell whether the original file contained a zero. \
To prevent this, please surround this kind of numeric code with double quotes.
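The effect of a missing pair of quotes can be illustrated without a YAML parser. In the sketch below, the two dictionaries are hand-written to simulate what a parser would return for an unquoted and a quoted `01` key, as described in the note; `stringify_keys` is a hypothetical helper, not the function used by the library.

```python
def stringify_keys(node):
    """Recursively convert every dictionary key to a string."""
    if isinstance(node, dict):
        return {str(key): stringify_keys(value) for key, value in node.items()}
    return node


# Simulated parser output: an unquoted 01 arrives as the integer 1,
# while a quoted "01" arrives as the string "01".
unquoted = {"ipc": {1: {"A": {1: "AGRICULTURE"}}}}
quoted = {"ipc": {1: {"A": {"01": "AGRICULTURE"}}}}

print(stringify_keys(unquoted)["ipc"]["1"]["A"])
print(stringify_keys(quoted)["ipc"]["1"]["A"])
```

Converting to strings after parsing turns the integer `1` into `"1"`, not `"01"`, so the leading zero can only be preserved by quoting the code in the file itself.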
### Examples
Below are the default YAML files, for reference. Notice how no quotation marks are required at all in `models.yaml`,
since none of the taxonomies contains "problematic" codes and the function used to process YAML files already reads all
keys as strings.
```yaml
# data/models.yaml
fos:
  0: intelcomp/fos_level0
ipc:
  0: intelcomp/ipc_level0
  1:
    A: intelcomp/ipc_level1_A
    B: intelcomp/ipc_level1_B
    C: intelcomp/ipc_level1_C
    D: intelcomp/ipc_level1_D
    E: intelcomp/ipc_level1_E
    F: intelcomp/ipc_level1_F
    G: intelcomp/ipc_level1_G
    H: intelcomp/ipc_level1_H
nace2:
  0: intelcomp/nace2_level0
  1:
    20: intelcomp/nace2_level1_20
    25: intelcomp/nace2_level1_25
    26: intelcomp/nace2_level1_26
    27: intelcomp/nace2_level1_27
    28: intelcomp/nace2_level1_28
    29: intelcomp/nace2_level1_29
    42: intelcomp/nace2_level1_42
```
```yaml
# data/classes.yaml
fos:
  0:
    0: ART
    1: BIOLOGY
    2: BUSINESS
    3: CHEMISTRY
    4: COMPUTER SCIENCE
    5: ECONOMICS
    6: ENGINEERING
    7: ENVIRONMENTAL SCIENCE
    8: GEOGRAPHY
    9: GEOLOGY
    10: HISTORY
    11: MATERIALS SCIENCE
    12: MATHEMATICS
    13: MEDICINE
    14: PHILOSOPHY
    15: PHYSICS
    16: POLITICAL SCIENCE
    17: PSYCHOLOGY
    18: SOCIOLOGY
ipc:
  0:
    A: HUMAN NECESSITIES
    B: PERFORMING OPERATIONS, TRANSPORTING
    C: CHEMISTRY, METALLURGY
    D: TEXTILES, PAPER
    E: FIXED CONSTRUCTIONS
    F: MECHANICAL ENGINEERING, LIGHTING, HEATING, WEAPONS
    G: PHYSICS
    H: ELECTRICITY
  1:
    A:
      "01": AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
      "21": BAKING; EQUIPMENT FOR MAKING OR PROCESSING DOUGHS; DOUGHS FOR BAKING [2006.01]
      "22": BUTCHERING; MEAT TREATMENT; PROCESSING POULTRY OR FISH
      "23": FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
      "24": TOBACCO; CIGARS; CIGARETTES; SIMULATED SMOKING DEVICES; SMOKERS' REQUISITES
      "41": WEARING APPAREL
      "42": HEADWEAR
      "43": FOOTWEAR
      "44": HABERDASHERY; JEWELLERY
      "45": HAND OR TRAVELLING ARTICLES
      "46": BRUSHWARE
      "47": FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
      "61": MEDICAL OR VETERINARY SCIENCE; HYGIENE
      "62": LIFE-SAVING; FIRE-FIGHTING
      "63": SPORTS; GAMES; AMUSEMENTS
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    B:
      "01": PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
      "02": CRUSHING, PULVERISING, OR DISINTEGRATING; PREPARATORY TREATMENT OF GRAIN FOR MILLING
      "03": SEPARATION OF SOLID MATERIALS USING LIQUIDS OR USING PNEUMATIC TABLES OR JIGS; MAGNETIC OR ELECTROSTATIC SEPARATION OF SOLID MATERIALS FROM SOLID MATERIALS OR FLUIDS; SEPARATION BY HIGH-VOLTAGE ELECTRIC FIELDS [5]
      "04": CENTRIFUGAL APPARATUS OR MACHINES FOR CARRYING-OUT PHYSICAL OR CHEMICAL PROCESSES
      "05": SPRAYING OR ATOMISING IN GENERAL; APPLYING FLUENT MATERIALS TO SURFACES, IN GENERAL [2]
      "06": GENERATING OR TRANSMITTING MECHANICAL VIBRATIONS IN GENERAL
      "07": SEPARATING SOLIDS FROM SOLIDS; SORTING
      "08": CLEANING
      "09": DISPOSAL OF SOLID WASTE; RECLAMATION OF CONTAMINATED SOIL [6]
      "21": MECHANICAL METAL-WORKING WITHOUT ESSENTIALLY REMOVING MATERIAL; PUNCHING METAL
      "22": CASTING; POWDER METALLURGY
      "23": MACHINE TOOLS; METAL-WORKING NOT OTHERWISE PROVIDED FOR
      "24": GRINDING; POLISHING
      "25": HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; HANDLES FOR HAND IMPLEMENTS; WORKSHOP EQUIPMENT; MANIPULATORS
      "26": HAND CUTTING TOOLS; CUTTING; SEVERING
      "27": WORKING OR PRESERVING WOOD OR SIMILAR MATERIAL; NAILING OR STAPLING MACHINES IN GENERAL
      "28": WORKING CEMENT, CLAY, OR STONE
      "29": WORKING OF PLASTICS; WORKING OF SUBSTANCES IN A PLASTIC STATE IN GENERAL
      "30": PRESSES
      "31": MAKING ARTICLES OF PAPER, CARDBOARD OR MATERIAL WORKED IN A MANNER ANALOGOUS TO PAPER; WORKING PAPER, CARDBOARD OR MATERIAL WORKED IN A MANNER ANALOGOUS TO PAPER
      "32": LAYERED PRODUCTS
      "33": ADDITIVE MANUFACTURING TECHNOLOGY [2015.01]
      "41": PRINTING; LINING MACHINES; TYPEWRITERS; STAMPS [4]
      "42": BOOKBINDING; ALBUMS; FILES; SPECIAL PRINTED MATTER
      "43": WRITING OR DRAWING IMPLEMENTS; BUREAU ACCESSORIES
      "44": DECORATIVE ARTS
      "60": VEHICLES IN GENERAL
      "61": RAILWAYS
      "62": LAND VEHICLES FOR TRAVELLING OTHERWISE THAN ON RAILS
      "63": SHIPS OR OTHER WATERBORNE VESSELS; RELATED EQUIPMENT
      "64": AIRCRAFT; AVIATION; COSMONAUTICS
      "65": CONVEYING; PACKING; STORING; HANDLING THIN OR FILAMENTARY MATERIAL
      "66": HOISTING; LIFTING; HAULING
      "67": OPENING OR CLOSING BOTTLES, JARS OR SIMILAR CONTAINERS; LIQUID HANDLING
      "68": SADDLERY; UPHOLSTERY
      "81": MICROSTRUCTURAL TECHNOLOGY [7]
      "82": NANOTECHNOLOGY [7]
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    C:
      "01": INORGANIC CHEMISTRY
      "02": TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
      "03": GLASS; MINERAL OR SLAG WOOL
      "04": CEMENTS; CONCRETE; ARTIFICIAL STONE; CERAMICS; REFRACTORIES [4]
      "05": FERTILISERS; MANUFACTURE THEREOF [4]
      "06": EXPLOSIVES; MATCHES
      "07": ORGANIC CHEMISTRY [2]
      "08": ORGANIC MACROMOLECULAR COMPOUNDS; THEIR PREPARATION OR CHEMICAL WORKING-UP; COMPOSITIONS BASED THEREON
      "09": DYES; PAINTS; POLISHES; NATURAL RESINS; ADHESIVES; COMPOSITIONS NOT OTHERWISE PROVIDED FOR; APPLICATIONS OF MATERIALS NOT OTHERWISE PROVIDED FOR
      "10": PETROLEUM, GAS OR COKE INDUSTRIES; TECHNICAL GASES CONTAINING CARBON MONOXIDE; FUELS; LUBRICANTS; PEAT
      "11": ANIMAL OR VEGETABLE OILS, FATS, FATTY SUBSTANCES OR WAXES; FATTY ACIDS THEREFROM; DETERGENTS; CANDLES
      "12": BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
      "13": SUGAR INDUSTRY [4]
      "14": SKINS; HIDES; PELTS OR LEATHER
      "21": METALLURGY OF IRON
      "22": METALLURGY; FERROUS OR NON-FERROUS ALLOYS; TREATMENT OF ALLOYS OR NON-FERROUS METALS
      "23": COATING METALLIC MATERIAL; COATING MATERIAL WITH METALLIC MATERIAL; CHEMICAL SURFACE TREATMENT; DIFFUSION TREATMENT OF METALLIC MATERIAL; COATING BY VACUUM EVAPORATION, BY SPUTTERING, BY ION IMPLANTATION OR BY CHEMICAL VAPOUR DEPOSITION, IN GENERAL; INHIBITING CORROSION OF METALLIC MATERIAL OR INCRUSTATION IN GENERAL [2]
      "25": ELECTROLYTIC OR ELECTROPHORETIC PROCESSES; APPARATUS THEREFOR [4]
      "30": CRYSTAL GROWTH [3]
      "40": COMBINATORIAL TECHNOLOGY [2006.01]
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    D:
      "01": NATURAL OR MAN-MADE THREADS OR FIBRES; SPINNING
      "02": YARNS; MECHANICAL FINISHING OF YARNS OR ROPES; WARPING OR BEAMING
      "03": WEAVING
      "04": BRAIDING; LACE-MAKING; KNITTING; TRIMMINGS; NON-WOVEN FABRICS
      "05": SEWING; EMBROIDERING; TUFTING
      "06": TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
      "07": ROPES; CABLES OTHER THAN ELECTRIC
      "21": PAPER-MAKING; PRODUCTION OF CELLULOSE
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    E:
      "01": CONSTRUCTION OF ROADS, RAILWAYS, OR BRIDGES
      "02": HYDRAULIC ENGINEERING; FOUNDATIONS; SOIL-SHIFTING
      "03": WATER SUPPLY; SEWERAGE
      "04": BUILDING
      "05": LOCKS; KEYS; WINDOW OR DOOR FITTINGS; SAFES
      "06": DOORS, WINDOWS, SHUTTERS, OR ROLLER BLINDS, IN GENERAL; LADDERS
      "21": EARTH OR ROCK DRILLING; MINING
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    F:
      "01": MACHINES OR ENGINES IN GENERAL; ENGINE PLANTS IN GENERAL; STEAM ENGINES
      "02": COMBUSTION ENGINES; HOT-GAS OR COMBUSTION-PRODUCT ENGINE PLANTS
      "03": MACHINES OR ENGINES FOR LIQUIDS; WIND, SPRING, OR WEIGHT MOTORS; PRODUCING MECHANICAL POWER OR A REACTIVE PROPULSIVE THRUST, NOT OTHERWISE PROVIDED FOR
      "04": POSITIVE-DISPLACEMENT MACHINES FOR LIQUIDS; PUMPS FOR LIQUIDS OR ELASTIC FLUIDS
      "15": FLUID-PRESSURE ACTUATORS; HYDRAULICS OR PNEUMATICS IN GENERAL
      "16": ENGINEERING ELEMENTS OR UNITS; GENERAL MEASURES FOR PRODUCING AND MAINTAINING EFFECTIVE FUNCTIONING OF MACHINES OR INSTALLATIONS; THERMAL INSULATION IN GENERAL
      "17": STORING OR DISTRIBUTING GASES OR LIQUIDS
      "21": LIGHTING
      "22": STEAM GENERATION
      "23": COMBUSTION APPARATUS; COMBUSTION PROCESSES
      "24": HEATING; RANGES; VENTILATING
      "25": REFRIGERATION OR COOLING; COMBINED HEATING AND REFRIGERATION SYSTEMS; HEAT PUMP SYSTEMS; MANUFACTURE OR STORAGE OF ICE; LIQUEFACTION OR SOLIDIFICATION OF GASES
      "26": DRYING
      "27": FURNACES; KILNS, OVENS OR RETORTS [4]
      "28": HEAT EXCHANGE IN GENERAL
      "41": WEAPONS
      "42": AMMUNITION; BLASTING
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    G:
      "01": MEASURING; TESTING
      "02": OPTICS
      "03": PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY [4]
      "04": HOROLOGY
      "05": CONTROLLING; REGULATING
      "06": COMPUTING; CALCULATING OR COUNTING
      "07": CHECKING-DEVICES
      "08": SIGNALLING
      "09": EDUCATING; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
      "10": MUSICAL INSTRUMENTS; ACOUSTICS
      "11": INFORMATION STORAGE
      "12": INSTRUMENT DETAILS
      "16": INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS [2018.01]
      "21": NUCLEAR PHYSICS; NUCLEAR ENGINEERING
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
    H:
      "01": BASIC ELECTRIC ELEMENTS
      "02": GENERATION, CONVERSION, OR DISTRIBUTION OF ELECTRIC POWER
      "03": BASIC ELECTRONIC CIRCUITRY
      "04": ELECTRIC COMMUNICATION TECHNIQUE
      "05": ELECTRIC TECHNIQUES NOT OTHERWISE PROVIDED FOR
      "99": SUBJECT MATTER NOT OTHERWISE PROVIDED FOR IN THIS SECTION [2006.01]
nace2:
  0:
    10: MANUFACTURE OF FOOD PRODUCTS
    11: MANUFACTURE OF BEVERAGES
    12: MANUFACTURE OF TOBACCO PRODUCTS
    13: MANUFACTURE OF TEXTILES
    14: MANUFACTURE OF WEARING APPAREL
    15: MANUFACTURE OF LEATHER AND RELATED PRODUCTS
    16: MANUFACTURE OF WOOD AND OF PRODUCTS OF WOOD AND CORK, EXCEPT FURNITURE; MANUFACTURE OF ARTICLES OF STRAW AND PLAITING MATERIALS
    17: MANUFACTURE OF PAPER AND PAPER PRODUCTS
    18: PRINTING AND REPRODUCTION OF RECORDED MEDIA
    19: MANUFACTURE OF COKE AND REFINED PETROLEUM PRODUCTS
    20: MANUFACTURE OF CHEMICALS AND CHEMICAL PRODUCTS
    21: MANUFACTURE OF BASIC PHARMACEUTICAL PRODUCTS AND PHARMACEUTICAL PREPARATIONS
    22: MANUFACTURE OF RUBBER AND PLASTIC PRODUCTS
    23: MANUFACTURE OF OTHER NON-METALLIC MINERAL PRODUCTS
    24: MANUFACTURE OF BASIC METALS
    25: MANUFACTURE OF FABRICATED METAL PRODUCTS, EXCEPT MACHINERY AND EQUIPMENT
    26: MANUFACTURE OF COMPUTER, ELECTRONIC AND OPTICAL PRODUCTS
    27: MANUFACTURE OF ELECTRICAL EQUIPMENT
    28: MANUFACTURE OF MACHINERY AND EQUIPMENT N.E.C.
    29: MANUFACTURE OF MOTOR VEHICLES, TRAILERS AND SEMI-TRAILERS
    30: MANUFACTURE OF OTHER TRANSPORT EQUIPMENT
    31: MANUFACTURE OF FURNITURE
    32: OTHER MANUFACTURING
    42: CIVIL ENGINEERING
    43: SPECIALISED CONSTRUCTION ACTIVITIES
    62: COMPUTER PROGRAMMING, CONSULTANCY AND RELATED ACTIVITIES
  1:
    20:
      1: MANUFACTURE OF BASIC CHEMICALS, FERTILISERS AND NITROGEN COMPOUNDS, PLASTICS AND SYNTHETIC RUBBER IN PRIMARY FORMS
      2: MANUFACTURE OF PESTICIDES AND OTHER AGROCHEMICAL PRODUCTS
      3: MANUFACTURE OF PAINTS, VARNISHES AND SIMILAR COATINGS, PRINTING INK AND MASTICS
      4: MANUFACTURE OF SOAP AND DETERGENTS, CLEANING AND POLISHING PREPARATIONS, PERFUMES AND TOILET PREPARATIONS
      5: MANUFACTURE OF OTHER CHEMICAL PRODUCTS
      6: MANUFACTURE OF MAN-MADE FIBRES
    25:
      1: MANUFACTURE OF STRUCTURAL METAL PRODUCTS
      2: MANUFACTURE OF TANKS, RESERVOIRS AND CONTAINERS OF METAL
      3: MANUFACTURE OF STEAM GENERATORS, EXCEPT CENTRAL HEATING HOT WATER BOILERS
      4: MANUFACTURE OF WEAPONS AND AMMUNITION
      5: FORGING, PRESSING, STAMPING AND ROLL-FORMING OF METAL; POWDER METALLURGY
      6: TREATMENT AND COATING OF METALS; MACHINING
      7: MANUFACTURE OF CUTLERY, TOOLS AND GENERAL HARDWARE
      9: MANUFACTURE OF OTHER FABRICATED METAL PRODUCTS
    26:
      1: MANUFACTURE OF ELECTRONIC COMPONENTS AND BOARDS
      2: MANUFACTURE OF COMPUTERS AND PERIPHERAL EQUIPMENT
      3: MANUFACTURE OF COMPUTERS AND PERIPHERAL EQUIPMENT
      4: MANUFACTURE OF CONSUMER ELECTRONICS
      5: MANUFACTURE OF INSTRUMENTS AND APPLIANCES FOR MEASURING, TESTING AND NAVIGATION; WATCHES AND CLOCKS
      6: MANUFACTURE OF IRRADIATION, ELECTROMEDICAL AND ELECTROTHERAPEUTIC EQUIPMENT
      7: MANUFACTURE OF OPTICAL INSTRUMENTS AND PHOTOGRAPHIC EQUIPMENT
      8: MANUFACTURE OF MAGNETIC AND OPTICAL MEDIA
    27:
      1: MANUFACTURE OF ELECTRIC MOTORS, GENERATORS, TRANSFORMERS AND ELECTRICITY DISTRIBUTION AND CONTROL APPARATUS
      2: MANUFACTURE OF BATTERIES AND ACCUMULATORS
      3: MANUFACTURE OF WIRING AND WIRING DEVICES
      4: MANUFACTURE OF ELECTRIC LIGHTING EQUIPMENT
      5: MANUFACTURE OF DOMESTIC APPLIANCES
      9: MANUFACTURE OF OTHER ELECTRICAL EQUIPMENT
    28:
      1: MANUFACTURE OF GENERAL-PURPOSE MACHINERY
      2: MANUFACTURE OF OTHER GENERAL-PURPOSE MACHINERY
      3: MANUFACTURE OF AGRICULTURAL AND FORESTRY MACHINERY
      4: MANUFACTURE OF METAL FORMING MACHINERY AND MACHINE TOOLS
      9: MANUFACTURE OF OTHER SPECIAL-PURPOSE MACHINERY
    29:
      1: MANUFACTURE OF MOTOR VEHICLES
      3: MANUFACTURE OF PARTS AND ACCESSORIES FOR MOTOR VEHICLES
    42:
      2: CONSTRUCTION OF UTILITY PROJECTS
      9: CONSTRUCTION OF OTHER CIVIL ENGINEERING PROJECTS
```
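When editing the two files by hand, a quick consistency check can catch level-1 model keys that have no counterpart among the level-0 classes. The sketch below is only an approximation under stated assumptions: it uses small hand-written excerpts of the files above with string keys, it treats the class codes in `classes.yaml` as the labels emitted by the level-0 model (which may not hold for every model), and `orphan_level1_models` is a hypothetical helper that is not part of the package.

```python
def orphan_level1_models(models, classes):
    """Return (taxonomy, key) pairs whose level-1 model key is not a level-0 class code."""
    orphans = []
    for taxonomy, levels in models.items():
        known_codes = set(classes.get(taxonomy, {}).get("0", {}))
        for key in levels.get("1", {}):
            if key not in known_codes:
                orphans.append((taxonomy, key))
    return orphans


# Hand-written excerpts of the default files, with all keys as strings.
models = {
    "fos": {"0": "intelcomp/fos_level0"},
    "nace2": {
        "0": "intelcomp/nace2_level0",
        "1": {"20": "intelcomp/nace2_level1_20", "42": "intelcomp/nace2_level1_42"},
    },
}
classes = {
    "nace2": {
        "0": {"20": "MANUFACTURE OF CHEMICALS AND CHEMICAL PRODUCTS", "42": "CIVIL ENGINEERING"},
    },
}

print(orphan_level1_models(models, classes))

# A key that no level-0 class code matches is reported as an orphan.
models["nace2"]["1"]["99"] = "path/to/hypothetical_model"
print(orphan_level1_models(models, classes))
```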
Raw data
{
"_id": null,
"home_page": "https://github.com/IntelCompH2020",
"name": "clf-inference-intelcomp",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "intelcomp,NLP,text classification,hierarchical classification,taxonomy learning",
"author": "Marc P\u00e0mies & Joan Llop",
"author_email": "langtech@bsc.es",
"download_url": "https://files.pythonhosted.org/packages/79/e6/f25fe54f98688c985b9acdb14a109ef2214fdbceab59b6ed35defd59da7e/clf_inference_intelcomp-0.1.6.tar.gz",
"platform": "any",
MACHINING\n 7: MANUFACTURE OF CUTLERY, TOOLS AND GENERAL HARDWARE\n 9: MANUFACTURE OF OTHER FABRICATED METAL PRODUCTS\n 26:\n 1: MANUFACTURE OF ELECTRONIC COMPONENTS AND BOARDS\n 2: MANUFACTURE OF COMPUTERS AND PERIPHERAL EQUIPMENT\n 3: MANUFACTURE OF COMMUNICATION EQUIPMENT\n 4: MANUFACTURE OF CONSUMER ELECTRONICS\n 5: MANUFACTURE OF INSTRUMENTS AND APPLIANCES FOR MEASURING, TESTING AND NAVIGATION; WATCHES AND CLOCKS\n 6: MANUFACTURE OF IRRADIATION, ELECTROMEDICAL AND ELECTROTHERAPEUTIC EQUIPMENT\n 7: MANUFACTURE OF OPTICAL INSTRUMENTS AND PHOTOGRAPHIC EQUIPMENT\n 8: MANUFACTURE OF MAGNETIC AND OPTICAL MEDIA\n 27:\n 1: MANUFACTURE OF ELECTRIC MOTORS, GENERATORS, TRANSFORMERS AND ELECTRICITY DISTRIBUTION AND CONTROL APPARATUS\n 2: MANUFACTURE OF BATTERIES AND ACCUMULATORS\n 3: MANUFACTURE OF WIRING AND WIRING DEVICES\n 4: MANUFACTURE OF ELECTRIC LIGHTING EQUIPMENT\n 5: MANUFACTURE OF DOMESTIC APPLIANCES\n 9: MANUFACTURE OF OTHER ELECTRICAL EQUIPMENT\n 28:\n 1: MANUFACTURE OF GENERAL-PURPOSE MACHINERY\n 2: MANUFACTURE OF OTHER GENERAL-PURPOSE MACHINERY\n 3: MANUFACTURE OF AGRICULTURAL AND FORESTRY MACHINERY\n 4: MANUFACTURE OF METAL FORMING MACHINERY AND MACHINE TOOLS\n 9: MANUFACTURE OF OTHER SPECIAL-PURPOSE MACHINERY\n 29:\n 1: MANUFACTURE OF MOTOR VEHICLES\n 3: MANUFACTURE OF PARTS AND ACCESSORIES FOR MOTOR VEHICLES\n 42:\n 2: CONSTRUCTION OF UTILITY PROJECTS\n 9: CONSTRUCTION OF OTHER CIVIL ENGINEERING PROJECTS\n```\n\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Python package to perform inference using Intelcomp's hierarchical text classifiers.",
"version": "0.1.6",
"project_urls": {
"Homepage": "https://github.com/IntelCompH2020"
},
"split_keywords": [
"intelcomp",
"nlp",
"text classification",
"hierarchical classification",
"taxonomy learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bcb6386a335a020814a4ad4bc9c34fe3a0cdc19467ecc01ffee238adcdc9594a",
"md5": "76c104db536ad94d0383da53fdf685a1",
"sha256": "32814ded4224a52bcd88d94c698afe055ca31d59dcb0805fda407cd97626c60b"
},
"downloads": -1,
"filename": "clf_inference_intelcomp-0.1.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "76c104db536ad94d0383da53fdf685a1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 18036,
"upload_time": "2023-09-06T10:09:00",
"upload_time_iso_8601": "2023-09-06T10:09:00.980875Z",
"url": "https://files.pythonhosted.org/packages/bc/b6/386a335a020814a4ad4bc9c34fe3a0cdc19467ecc01ffee238adcdc9594a/clf_inference_intelcomp-0.1.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "79e6f25fe54f98688c985b9acdb14a109ef2214fdbceab59b6ed35defd59da7e",
"md5": "73b93cd8ce9c03dbf377aee67fa0ca21",
"sha256": "c74599ca10f89e75bcb24ed21d5699eb43fd4624a3df7a9500778d0d06e54137"
},
"downloads": -1,
"filename": "clf_inference_intelcomp-0.1.6.tar.gz",
"has_sig": false,
"md5_digest": "73b93cd8ce9c03dbf377aee67fa0ca21",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 17259,
"upload_time": "2023-09-06T10:09:02",
"upload_time_iso_8601": "2023-09-06T10:09:02.477713Z",
"url": "https://files.pythonhosted.org/packages/79/e6/f25fe54f98688c985b9acdb14a109ef2214fdbceab59b6ed35defd59da7e/clf_inference_intelcomp-0.1.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-06 10:09:02",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "clf-inference-intelcomp"
}