firstname-to-nationality


| Field | Value |
| --- | --- |
| Name | firstname-to-nationality |
| Version | 1.0.0 |
| home_page | https://github.com/callidio/firstname_to_nationality |
| Summary | Nationality Prediction from Firstname using Python 3.13 and scikit-learn |
| upload_time | 2025-11-07 06:34:49 |
| maintainer | None |
| docs_url | None |
| author | Firstname to Nationality Team |
| requires_python | >=3.11 |
| license | MIT |
| keywords | firstname, nationality, prediction, names, machine-learning, nlp |
| requirements | numpy, scikit-learn, joblib, pandas, matplotlib, seaborn, pytest, black, isort, pylint, mypy, types-requests |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Firstname to Nationality - Python 3.13 Implementation

A library that predicts likely nationalities from names, built on scikit-learn. It targets Python 3.13, although the package metadata declares `requires_python >=3.11`.

## 🚀 Features

This library provides the following capabilities:

- ✅ **Python 3.13+ Compatible**: Uses modern Python language features and type hints
- ✅ **ML Stack**: Built with scikit-learn for performance and compatibility
- ✅ **Type Safety**: Full type hints and dataclasses throughout
- ✅ **Error Handling**: Robust error handling and fallbacks
- ✅ **Dev Container Ready**: Includes VS Code dev container configuration
- ✅ **Flexible Training**: Easy model training with your own data
- ✅ **Batch Processing**: Efficient batch prediction support

## 📦 Installation

### Using the Dev Container (Recommended)

1. Open in VS Code
2. When prompted, click "Reopen in Container"
3. The dev container will build automatically with Python 3.13

### Manual Installation

```bash
# Ensure you have Python 3.13+
python --version

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .
```

## 🔧 Quick Start

```python
from firstname_to_nationality import FirstnameToNationality

# Initialize the predictor
predictor = FirstnameToNationality()

# Predict nationality for a single name
result = predictor.predict_single("Giuseppe Rossi", top_n=3)
print(result)  # [('Italian', 0.85), ('Spanish', 0.12), ...]

# Batch prediction
names = ["John Smith", "Maria Rodriguez", "Zhang Wei"]
results = predictor(names, top_n=2)

for name, predictions in results:
    nationality, confidence = predictions[0]
    print(f"{name} → {nationality} ({confidence:.2f})")
```

## 🧪 Examples

Run the example script:

```bash
python example.py
```

## 🔥 Training Your Own Model

### Using Sample Data

```bash
python nationality_trainer.py
```

### Using Your Own Data

Create a CSV file with `name` and `nationality` columns:

```csv
name,nationality
John Smith,American
Giuseppe Rossi,Italian
Hiroshi Tanaka,Japanese
```

Then train:

```bash
python nationality_trainer.py your_data.csv
```
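
Before training, it can help to check that the CSV really has the two expected columns and no half-empty rows. The following is a minimal sketch using only the standard library; `load_training_rows` is a hypothetical helper, not something the package provides, and the in-memory sample stands in for a real file.

```python
import csv
import io


def load_training_rows(csv_file) -> tuple[list[str], list[str]]:
    """Read `name,nationality` rows, skipping incomplete ones."""
    reader = csv.DictReader(csv_file)
    missing = {"name", "nationality"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")

    names: list[str] = []
    nationalities: list[str] = []
    for row in reader:
        # Short rows yield None for absent fields, hence the `or ""`.
        name = (row["name"] or "").strip()
        nationality = (row["nationality"] or "").strip()
        if name and nationality:
            names.append(name)
            nationalities.append(nationality)
    return names, nationalities


# In-memory stand-in for open("your_data.csv", newline="", encoding="utf-8").
sample = io.StringIO(
    "name,nationality\n"
    "John Smith,American\n"
    "Giuseppe Rossi,Italian\n"
    ",Japanese\n"
)
names, nationalities = load_training_rows(sample)
print(names, nationalities)
```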

### Creating a Dictionary

```bash
python nationality_trainer.py --dict
```

## 🏗️ Architecture

The implementation consists of:

- **`FirstnameToNationality`**: Main predictor class with scikit-learn backend  
- **`NamePreprocessor`**: Advanced name preprocessing and normalization
- **`PredictionResult`**: Type-safe prediction results using dataclasses
- **Model Pipeline**: TF-IDF vectorization + Logistic Regression
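
To make the "TF-IDF vectorization + Logistic Regression" pipeline concrete, here is a minimal scikit-learn sketch. The character n-gram settings, the tiny training set, and the `predict_top_n` helper are assumptions chosen for illustration; the package's actual hyperparameters and preprocessing are not documented here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set, only to make the sketch runnable.
training_names = [
    "Giuseppe", "Giovanni", "Luca",
    "Hiroshi", "Takashi", "Kenji",
    "John", "William", "James",
]
training_labels = ["Italian"] * 3 + ["Japanese"] * 3 + ["American"] * 3

# Character n-grams (an assumed choice) capture name endings and spelling patterns.
pipeline = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3), lowercase=True),
    LogisticRegression(max_iter=1000),
)
pipeline.fit(training_names, training_labels)


def predict_top_n(name: str, top_n: int = 3) -> list[tuple[str, float]]:
    """Return the `top_n` (label, probability) pairs, most likely first."""
    probabilities = pipeline.predict_proba([name])[0]
    ranked = sorted(
        zip(pipeline.classes_, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [(str(label), float(p)) for label, p in ranked[:top_n]]


print(predict_top_n("Giuseppe", top_n=2))
```

With so little data the probabilities are not meaningful; the point is only how the vectorizer and classifier chain together and how class probabilities map back to labels.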

## 📁 File Structure

The implementation uses these file paths:

- `firstname_to_nationality/best-model.pt`: Model checkpoint file
- `firstname_to_nationality/firstname_nationalities.pkl`: Name-to-nationality dictionary
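
The internal layout of `firstname_nationalities.pkl` is not described here. As a rough sketch, assuming it is a plain `dict` that maps a lowercase first name to a nationality, a pickle round trip looks like this. The helper names are hypothetical, and the example writes to a temporary directory so that no packaged file is touched.

```python
import pickle
import tempfile
from pathlib import Path


def save_name_dictionary(mapping: dict[str, str], path: Path) -> None:
    """Serialize the mapping with pickle."""
    with path.open("wb") as handle:
        pickle.dump(mapping, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_name_dictionary(path: Path) -> dict[str, str]:
    """Load a mapping written by save_name_dictionary."""
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with path.open("rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    dictionary_path = Path(tmp) / "firstname_nationalities.pkl"
    # Assumed format: lowercase first name -> nationality.
    save_name_dictionary(
        {"giuseppe": "Italian", "hiroshi": "Japanese"}, dictionary_path
    )
    restored = load_name_dictionary(dictionary_path)

print(restored.get("giuseppe"), restored.get("unknown"))
```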

## Usage Examples

### Basic Usage
```python
from firstname_to_nationality import FirstnameToNationality
predictor = FirstnameToNationality()
results = predictor(["John Smith"])
```

### Advanced Features
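```python
from firstname_to_nationality import FirstnameToNationality

predictor = FirstnameToNationality()

# Type-safe single predictions
result = predictor.predict_single("John Smith", top_n=3)

# Training interface: `names` and `nationalities` are your own training
# data (for example, the two columns of the CSV described above)
predictor.train(names, nationalities, save_model=True)

# Dictionary management: `name_dict` is a name-to-nationality mapping you supply
predictor.save_dictionary(name_dict)
```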
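
To show what "type-safe" results can look like in practice, the sketch below wraps `(nationality, confidence)` tuples in a frozen dataclass. `NationalityGuess` and `to_guesses` are hypothetical names used for illustration; they are not the library's `PredictionResult`, whose fields are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NationalityGuess:
    """One candidate nationality with its confidence in [0, 1]."""

    nationality: str
    confidence: float

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")


def to_guesses(pairs: list[tuple[str, float]]) -> list[NationalityGuess]:
    """Convert raw (nationality, confidence) tuples into typed objects."""
    return [NationalityGuess(nationality, confidence) for nationality, confidence in pairs]


# Sample values taken from the Quick Start output comment.
guesses = to_guesses([("Italian", 0.85), ("Spanish", 0.12)])
print(guesses[0].nationality, guesses[0].confidence)
```

Named fields let a type checker such as mypy catch mistakes like swapping the label and the score, which a bare tuple would not.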

## 🐳 Development with Docker

### Dev Container
The repository includes a complete dev container setup for VS Code:

```bash
# Open in VS Code
code .
# Click "Reopen in Container" when prompted
```

### Manual Docker
```bash
# Build
docker build -f .devcontainer/Dockerfile -t firstname-to-nationality .

# Run
docker run -it --rm -v $(pwd):/workspace firstname-to-nationality
```

## ⚡ Performance

The implementation offers:

- Fast training with scikit-learn
- Memory efficiency
- Batch processing support
- Python optimizations

## 🧬 Dependencies

**Core Requirements:**
- Python 3.13+
- scikit-learn >= 1.3.0
- numpy >= 1.25.0
- pandas >= 2.0.0
- joblib >= 1.3.0

**Development:**
- pytest, black, isort, pylint, mypy
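
The minimum versions listed under Core Requirements can be checked against the current environment with the standard library's `importlib.metadata`. This is a rough sketch: `parse_release` is a deliberately simple hypothetical parser that reads only the leading numeric part of a version string and is not a full PEP 440 implementation.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimums copied from the Core Requirements list above.
MINIMUM_VERSIONS = {
    "scikit-learn": (1, 3, 0),
    "numpy": (1, 25, 0),
    "pandas": (2, 0, 0),
    "joblib": (1, 3, 0),
}


def parse_release(text: str) -> tuple[int, ...]:
    """Parse the leading numeric components, e.g. '1.26.4rc1' -> (1, 26, 4)."""
    parts: list[int] = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_dependencies(
    minimums: dict[str, tuple[int, ...]] = MINIMUM_VERSIONS,
) -> dict[str, str]:
    """Report each package as missing, ok, or too old."""
    report: dict[str, str] = {}
    for package, minimum in minimums.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            report[package] = "missing"
            continue
        parsed = parse_release(installed)
        # Pad short versions such as "1.3" so they compare equal to (1, 3, 0).
        parsed += (0,) * (len(minimum) - len(parsed))
        status = "ok" if parsed >= minimum else "too old"
        report[package] = f"{installed} ({status})"
    return report


for package, status in check_dependencies().items():
    print(f"{package}: {status}")
```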

## 🤝 Contributing

1. Use the dev container for a consistent environment
2. Add type hints to all new code
3. Run tests: `pytest`
4. Format code: `black . && isort .`
5. Check types: `mypy firstname_to_nationality/`

## 📄 License

MIT License

## Implementation Details

This is a complete implementation with:

- ✅ Consistent method signatures
- ✅ Reliable file handling
- ✅ Robust prediction results
- ✅ Efficient model format
- ✅ Minimal dependencies

## 🎯 Roadmap

- [ ] Transformer-based models support
- [ ] REST API server
- [ ] Web interface
- [ ] Multi-language support
- [ ] Advanced evaluation metrics

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/callidio/firstname_to_nationality",
    "name": "firstname-to-nationality",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "firstname nationality prediction names machine-learning nlp",
    "author": "Firstname to Nationality Team",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/41/cc/8ce49461960c6c10e73edf7173bac1290b45a768ac1db7957919a1a75752/firstname_to_nationality-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Nationality Prediction from Firstname using Python 3.13 and scikit-learn",
    "version": "1.0.0",
    "project_urls": {
        "Documentation": "https://github.com/callidio/firstname_to_nationality#readme",
        "Homepage": "https://github.com/callidio/firstname_to_nationality",
        "Source": "https://github.com/callidio/firstname_to_nationality",
        "Tracker": "https://github.com/callidio/firstname_to_nationality/issues"
    },
    "split_keywords": [
        "firstname",
        "nationality",
        "prediction",
        "names",
        "machine-learning",
        "nlp"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "93856671ffa40b70bae35034a343364d347ee9543de847e685f039a57cd13dd2",
                "md5": "254f4f44c664f72da723e578e80bf10f",
                "sha256": "639d1963c27456bc6cc404421e6e40c482d1f78c0ab2840834cc0381eaa2d35b"
            },
            "downloads": -1,
            "filename": "firstname_to_nationality-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "254f4f44c664f72da723e578e80bf10f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 27244601,
            "upload_time": "2025-11-07T06:34:46",
            "upload_time_iso_8601": "2025-11-07T06:34:46.124777Z",
            "url": "https://files.pythonhosted.org/packages/93/85/6671ffa40b70bae35034a343364d347ee9543de847e685f039a57cd13dd2/firstname_to_nationality-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "41cc8ce49461960c6c10e73edf7173bac1290b45a768ac1db7957919a1a75752",
                "md5": "0a356d254478707a80946d7fb2fe46ca",
                "sha256": "3afbad07311e00db2d3b1ddf50f909e3ca83ddc828f36d6f8a066076c028e991"
            },
            "downloads": -1,
            "filename": "firstname_to_nationality-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "0a356d254478707a80946d7fb2fe46ca",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 27008265,
            "upload_time": "2025-11-07T06:34:49",
            "upload_time_iso_8601": "2025-11-07T06:34:49.870457Z",
            "url": "https://files.pythonhosted.org/packages/41/cc/8ce49461960c6c10e73edf7173bac1290b45a768ac1db7957919a1a75752/firstname_to_nationality-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-11-07 06:34:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "callidio",
    "github_project": "firstname_to_nationality",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.25.0"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "joblib",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    ">=",
                    "3.7.0"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    ">=",
                    "0.12.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    ">=",
                    "7.4.0"
                ]
            ]
        },
        {
            "name": "black",
            "specs": [
                [
                    ">=",
                    "23.0.0"
                ]
            ]
        },
        {
            "name": "isort",
            "specs": [
                [
                    ">=",
                    "5.12.0"
                ]
            ]
        },
        {
            "name": "pylint",
            "specs": [
                [
                    ">=",
                    "2.17.0"
                ]
            ]
        },
        {
            "name": "mypy",
            "specs": [
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "types-requests",
            "specs": []
        }
    ],
    "lcname": "firstname-to-nationality"
}
        