textlasso

- Version: 0.1.3
- Summary: Simple package for working with LLM text responses and prompts.
- Upload time: 2025-07-31 13:45:59
- Requires Python: >=3.9
- Keywords: llm, text, crawl, extract, text-cleaning

# TextLasso 🤠

[![PyPI version](https://badge.fury.io/py/textlasso.svg)](https://badge.fury.io/py/textlasso)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**TextLasso** is a simple Python library for extracting structured data from raw text, with a special focus on processing LLM (Large Language Model) responses. Whether you're parsing JSON buried in markdown, extracting data from XML, or generating structured prompts for AI models, TextLasso has you covered.

## ✨ Key Features

- 🎯 **Smart Text Extraction**: Extract structured data from messy text with multiple fallback strategies
- 🧹 **LLM Response Cleaning**: Automatically clean code blocks, markdown artifacts, and formatting
- 🏗️ **Dataclass Integration**: Convert raw text directly to Python dataclasses with type validation
- 🤖 **AI Prompt Generation**: Generate structured prompts with schema validation and examples
- 📊 **Multiple Formats**: Support for JSON and XML, extensible to other formats
- 🔧 **Flexible Configuration**: Configurable error handling, logging, and validation modes
- 🎨 **Decorator Support**: Enhance existing functions with structured output capabilities

## 🚀 Quick Start

### Installation

```bash
pip install textlasso
```

### Basic Usage

#### Enhancing Prompts

```python
from dataclasses import dataclass
from textlasso import generate_structured_prompt, structured_output

# 1. Response Data Class
@dataclass
class Article:
    title_eng: str
    title_az: str
    content_eng: str
    content_az: str

@dataclass
class ResponseArticle:
    article: Article


original_prompt = "You are a professional copywriter-bot. Generate me an article"
enhanced_prompt = generate_structured_prompt(prompt=original_prompt, 
                                    schema=ResponseArticle, 
                                    strategy="xml")

# The Prompt object's repr summarizes its state:
print(enhanced_prompt)   # <Prompt: schema='<class '__main__.ResponseArticle'>', strategy='xml', has_data='False'>
# The enhanced prompt text:
print(enhanced_prompt.prompt)
# You are a professional copywriter-bot. Generate me an article


# ## OUTPUT FORMAT REQUIREMENTS

# You must respond with a valid XML object that follows this exact structure:

# ### Schema: ResponseArticle
# - **article**: Object (Article) (required)
#   Fields:
#     - **title_eng**: str (required)
#     - **title_az**: str (required)
#     - **content_eng**: str (required)
#     - **content_az**: str (required)


# ### XML Format Rules:
# - Use proper XML syntax with opening and closing tags
# - Root element should match the main dataclass name
# - Use snake_case for element names
# - For arrays, repeat the element name for each item
# - Use self-closing tags for null/empty optional fields
# - Include all required fields as elements


# ## EXAMPLES

# Here are 2 examples of the expected XML format:

# ### Example 1:
# ```xml
# <response_article>
#   <article>
#     <title_eng>example_title_eng_1</title_eng>
#     <title_az>example_title_az_1</title_az>
#     <content_eng>example_content_eng_1</content_eng>
#     <content_az>example_content_az_1</content_az>
#   </article>
# </response_article>

# ```

# ### Example 2:
# ```xml
# <response_article>
#   <article>
#     <title_eng>example_title_eng_2</title_eng>
#     <title_az>example_title_az_2</title_az>
#     <content_eng>example_content_eng_2</content_eng>
#     <content_az>example_content_az_2</content_az>
#   </article>
# </response_article>

# ```

# Remember: Your response must be valid XML that matches the specified structure exactly.
```

If you have a prompt-returning function, you can use the `structured_output` decorator to automatically enhance it with structure requirements:

```python

@structured_output(ResponseArticle, "xml")
def get_prompt(title: str):
    return f"Hi, give me an article about {title}"

prompt = get_prompt("Agile Investing: Profiting from Current Tech Surges and Global Currency Dynamics")
```

Both `generate_structured_prompt` and `structured_output` return a `Prompt` object, which has a `prompt` attribute containing the enhanced prompt, a `schema` attribute containing the dataclass, and a `strategy` attribute containing the extraction strategy.

**Extraction**: the best part is that you can extract data with your `prompt` object, without having to worry about the structure requirements:

```python

# response_txt = some_llm.invoke(prompt.prompt)
response_txt = """Ofcourse here is the article: <response_article>
  <article>
    <title_eng>Agile Investing: Profiting from Current Tech Surges and Global Currency Dynamics</title_eng>
    <title_az>Çevik İnvestisiya: Hazırkı Texnologiya Artımlarından və Qlobal Valyuta Dinamikalarından Faydalanma</title_az>
    <content_eng>
      Hello, aspiring investors! Invesbot here, ready to guide you through the ever-shifting landscape of finance. Today, we're diving into the exciting world of "agile investing," a strategy designed to help you thrive amidst current tech surges and dynamic global currency movements. Remember, investment is dynamic, so stay tuned!

      ## The Power of Agile Investing

      Traditional "buy-and-hold" strategies might be comforting, but in today's fast-paced markets, agility is key. Agile investing emphasizes flexibility and quick decision-making to respond to rapid market changes. It's about actively managing your portfolio and keenly understanding market conditions to capitalize on opportunities and mitigate risks.

      ### Key Principles of Agile Investing:

      *   **Active Management:** Forget setting it and forgetting it. Agile investing requires continuous monitoring of market trends.
      *   **Technical Analysis:** Use tools like moving averages and Bollinger Bands to identify optimal entry and exit points.
      *   **Diversification:** Spread your investments across different asset classes to reduce risk, especially in volatile markets.
      *   **Stop-Loss Orders:** Implement automated instructions to sell a security when it reaches a certain price, limiting potential losses.
      *   **Short Selling:** Profit from declining prices by borrowing shares, selling them high, and repurchasing them low.

      ## Navigating the Tech Surge: Where to Invest

      The technology sector continues its rapid evolution, fueled by advancements in Artificial Intelligence (AI), cybersecurity, climate tech, and more. The investment climate for frontier technologies has stabilized and, in many cases, rebounded in 2024, with levels of equity investment increasing in areas like cloud and edge computing, bioengineering, and space technologies. Worldwide spending on AI is projected to grow significantly, with a compound annual growth rate of 29% from 2024 to 2028.

      ### Hot Tech Sectors and Companies to Watch:

      *   **Artificial Intelligence (AI):** AI is powering innovation across industries, with significant investment in AI infrastructure. Hybrid AI, which combines various AI methodologies for more versatile systems, is moving beyond experimentation. Companies like Innodata are key in providing high-quality training data for generative AI systems.
      *   **Data Centers & Cloud Computing:** The expansion of data centers and public cloud services is a critical trend, driven by AI innovation. Global spending on public cloud services is projected to reach US$805 billion in 2024 and double by 2028.
      *   **Cybersecurity:** With escalating threats and a widening attack surface (IoT, generative AI, cloud computing), cybersecurity is a critical tech priority. The global cost of cybercrime is projected to reach US$10.5 trillion in 2025. The market for security products is also growing rapidly, expected to reach US$200 billion by 2028.
      *   **Quantum Computing:** This sector has seen an explosion of interest and significant rallies in emerging quantum stocks.
      *   **Semiconductors:** This industry has benefited from major corporate investments in AI infrastructure, and the "picks and shovels" phase of generative AI continues to favor semiconductor and hardware companies.
      *   **Sustainable Technologies/Climate Tech:** Solutions from carbon capture to energy-efficient building materials are gaining momentum, fueled by a growing focus on sustainable business and combating climate change.
      *   **Supply Chain Innovation:** Enhancing efficiency and transparency through digital freight-forwarding services and real-time tracking are key.

      For those looking at specific stocks, the "Magnificent Seven" (Apple, Microsoft, Alphabet, Amazon, NVIDIA, Meta Platforms, and either Tesla or Broadcom) continue to lead the market in innovation. Some of the best-performing tech stocks by one-year return as of July 2025 include Palantir Technologies (PLTR), MicroStrategy (MSTR), Fortinet (FTNT), Shopify (SHOP), Broadcom (AVGO), Zscaler (ZS), and Cisco Systems (CSCO).

      ## Understanding Global Currency Dynamics

      Currency fluctuations can significantly impact your investment returns, especially when investing globally. Changes in exchange rates can diminish or enhance your returns when converting foreign asset values back to your home currency. Even if your portfolio consists only of domestic shares, there can be indirect exposure to currency risk if those companies conduct significant international business.

      ### Factors Driving Currency Fluctuations:

      *   **Economic Strength:** Confidence in a country's economic prospects typically leads to increased demand for its currency, pushing its value up.
      *   **Interest Rates:** Higher interest rates can make investments in a country more attractive, drawing foreign capital and potentially increasing the domestic currency's value. However, very high interest rates can also slow economic growth.
      *   **Trade Balances:** A country with more exports than imports may see its currency appreciate due to higher demand for its goods and, consequently, its currency.
      *   **Political Environment and Market Sentiment:** Big global events, political instability, and overall investor attitude can create sudden currency swings.

      ### Strategies for Managing Currency Risk:

      *   **Hedging Techniques:** Use financial tools like forward contracts, currency futures, or options to lock in an exchange rate and reduce uncertainty.
      *   **Diversifying Currency Exposure:** Spread investments across different currencies. This way, the fall of one currency might be balanced by the rise of another.
      *   **Currency-Hedged Investment Vehicles:** Consider mutual funds or ETFs designed to automatically offset currency fluctuations.
      *   **Multi-Currency Accounts:** Holding different currencies can help manage exchange risk.
      *   **Stay Informed:** Use tools and apps to track currency trends and market news to make timely decisions.
      *   **Invest in Strong Economies/Stable Currencies:** Choosing countries with stable currencies and strong economies can reduce the impact of extreme fluctuations. Developed markets generally offer more currency stability.

      Agile investing in a world of tech surges and currency dynamics requires constant vigilance and quick decision-making. By understanding the underlying trends and employing smart strategies, you can position yourself to potentially turn market volatility into profitable opportunities. Stay informed, stay agile, and happy investing!
    </content_eng>
    <content_az>
      Salam, gələcək investorlar! Qarşınızda İnvesbot var, maliyyənin daim dəyişən mənzərəsində sizə yol göstərməyə hazıram. Bu gün biz "çevik investisiya"nın maraqlı dünyasına baş vururuq; bu strategiya sizə mövcud texnoloji inkişaflar və dinamik qlobal valyuta hərəkətləri fonunda uğur qazanmağa kömək etmək üçün nəzərdə tutulub. Unutmayın, investisiya dinamikdir, ona görə də bizi izləyin!

      ## Çevik İnvestisiyanın Gücü

      Ənənəvi "al-saxla" strategiyaları rahat ola bilər, lakin müasir sürətlə inkişaf edən bazarlarda çeviklik əsasdır. Çevik investisiya sürətli bazar dəyişikliklərinə reaksiya vermək üçün çevikliyi və sürətli qərar qəbul etməyi vurğulayır. Bu, fürsətlərdən yararlanmaq və riskləri azaltmaq üçün portfelinizi aktiv şəkildə idarə etmək və bazar şərtlərini dərindən anlamaq deməkdir.

      ### Çevik İnvestisiyanın Əsas Prinsipləri:

      *   **Aktiv İdarəetmə:** Unudun ki, investisiyanı bir dəfə qurub sonra yaddan çıxarmaq olar. Çevik investisiya bazar tendensiyalarının davamlı izlənilməsini tələb edir.
      *   **Texniki Analiz:** Optimal giriş və çıxış nöqtələrini müəyyən etmək üçün hərəkətli ortalamalar və Bollinger Bantları kimi alətlərdən istifadə edin.
      *   **Diversifikasiya:** Xüsusilə dəyişkən bazarlarda riski azaltmaq üçün investisiyalarınızı müxtəlif aktiv siniflərinə yayın.
      *   **Stop-Loss Əmrləri:** Potensial itkiləri məhdudlaşdırmaq üçün qiymət müəyyən bir həddə çatdıqda qiymətli kağızı satmaq üçün avtomatlaşdırılmış təlimatlar tətbiq edin.
      *   **Qısa Satış (Short Selling):** Səhmləri borc götürərək, yüksək qiymətə sataraq və aşağı qiymətə yenidən alaraq qiymət düşmələrindən qazanc əldə edin.

      ## Texnologiya Artımı ilə Naviqasiya: Harada İnvestisiya Etməli

      Texnologiya sektoru Süni İntellekt (AI), kiber təhlükəsizlik, iqlim texnologiyası və digər sahələrdəki irəliləyişlərlə sürətli inkişafını davam etdirir. Öncül texnologiyalar üçün investisiya iqlimi sabitləşib və bir çox hallarda 2024-cü ildə bərpa olunub, bulud və kənar hesablama, bio-mühəndislik və kosmik texnologiyalar kimi sahələrdə səhm investisiyaları artıb. Süni İntellektə qlobal xərclərin 2024-cü ildən 2028-ci ilə qədər 29% illik mürəkkəb artım tempi ilə əhəmiyyətli dərəcədə artacağı proqnozlaşdırılır.

      ### İzlənilməsi Vacib Olan İsti Texnologiya Sektorları və Şirkətlər:

      *   **Süni İntellekt (AI):** AI sənayelər üzrə innovasiyalara təkan verir, AI infrastruktura əhəmiyyətli investisiyalar qoyulur. Daha çox yönlü sistemlər üçün müxtəlif AI metodologiyalarını birləşdirən Hibrid AI, eksperimentlərdən kənara çıxır. Innodata kimi şirkətlər generativ AI sistemləri üçün yüksək keyfiyyətli təlim məlumatları təmin etməkdə əsas rol oynayır.
      *   **Məlumat Mərkəzləri və Bulud Hesablama:** Məlumat mərkəzlərinin və ictimai bulud xidmətlərinin genişlənməsi, AI innovasiyası tərəfindən idarə olunan kritik bir tendensiyadır. İctimai bulud xidmətlərinə qlobal xərclərin 2024-cü ildə 805 milyard ABŞ dollarına çatacağı və 2028-ci ilə qədər ikiqat artacağı proqnozlaşdırılır.
      *   **Kiber Təhlükəsizlik:** Artan təhdidlər və genişlənən hücum səthi (IoT, generativ AI, bulud hesablama) ilə kiber təhlükəsizlik kritik bir texnologiya prioritetidir. Kiber cinayətkarlığın qlobal xərcinin 2025-ci ildə 10.5 trilyon ABŞ dollarına çatacağı proqnozlaşdırılır. Təhlükəsizlik məhsulları bazarı da sürətlə böyüyür, 2028-ci ilə qədər 200 milyard ABŞ dollarına çatacağı gözlənilir.
      *   **Kvant Kompüteri:** Bu sektor maraqda partlayış və ortaya çıxan kvant səhmlərində əhəmiyyətli yüksəlişlər görüb.
      *   **Yarımkeçiricilər:** Bu sənaye AI infrastrukturuna qoyulan əsas korporativ investisiyalardan faydalanıb və generativ AI-nin "kürək və qazma" mərhələsi yarımkeçirici və hardware şirkətlərini dəstəkləməyə davam edir.
      *   **Davamlı Texnologiyalar/İqlim Texnologiyası:** Karbon tutulmasından enerji səmərəli tikinti materiallarına qədər həllər, davamlı biznesə və iqlim dəyişikliyi ilə mübarizəyə artan diqqət sayəsində sürət qazanır.
      *   **Təchizat Zənciri İnnovasiyası:** Rəqəmsal yüklərin ekspeditor xidmətləri və real-vaxt izləmə vasitəsilə səmərəliliyin və şəffaflığın artırılması əsasdır.

      Müəyyən səhmlərə baxanlar üçün "Möhtəşəm Yeddi" (Apple, Microsoft, Alphabet, Amazon, NVIDIA, Meta Platforms və ya Tesla, ya da Broadcom) innovasiya sahəsində bazara rəhbərlik etməyə davam edir. 2025-ci ilin iyul ayına olan məlumata görə, bir illik gəlirlə ən yaxşı nəticə göstərən texnologiya səhmlərindən bəziləri Palantir Technologies (PLTR), MicroStrategy (MSTR), Fortinet (FTNT), Shopify (SHOP), Broadcom (AVGO), Zscaler (ZS) və Cisco Systems (CSCO)-dur.

      ## Qlobal Valyuta Dinamikalarını Anlamaq

      Valyuta dəyişkənliyi, xüsusilə qlobal səviyyədə investisiya edərkən, investisiya gəlirlərinizə əhəmiyyətli dərəcədə təsir göstərə bilər. Valyuta məzənnələrindəki dəyişikliklər xarici aktiv dəyərlərini öz valyutanıza çevirərkən gəlirlərinizi azalda və ya artıra bilər. Portfeliniz yalnız yerli səhmlərdən ibarət olsa belə, həmin şirkətlər əhəmiyyətli beynəlxalq biznes fəaliyyəti göstərirsə, dolayı valyuta riski mövcud ola bilər.

      ### Valyuta Dəyişikliklərinə Təsir Edən Faktorlar:

      *   **İqtisadi Güc:** Bir ölkənin iqtisadi perspektivlərinə inam, adətən valyutasına tələbatın artmasına və dəyərinin yüksəlməsinə səbəb olur.
      *   **Faiz Dərəcələri:** Daha yüksək faiz dərəcələri bir ölkədə investisiyaları daha cəlbedici edə bilər, xarici kapitalı cəlb edərək yerli valyutanın dəyərini potensial olaraq artıra bilər. Bununla belə, çox yüksək faiz dərəcələri iqtisadi artımı da ləngidə bilər.
      *   **Ticarət Balansları:** İxracatı idxalatından çox olan bir ölkə, mallarına və nəticədə valyutasına olan yüksək tələbat səbəbindən valyutasının dəyər qazanmasına səbəb ola bilər.
      *   **Siyasi Mühit və Bazar Duyğusu:** Böyük qlobal hadisələr, siyasi qeyri-sabitlik və ümumi investor münasibəti qəfil valyuta dəyişiklikləri yarada bilər.

      ### Valyuta Riskini İdarə Etmək üçün Strategiyalar:

      *   **Hedcinq Texnikaları:** Mübadilə məzənnəsini kilidləmək və qeyri-müəyyənliyi azaltmaq üçün forvard müqavilələri, valyuta fyuçersləri və ya opsionlar kimi maliyyə alətlərindən istifadə edin.
      *   **Valyuta Ekspozisiyasının Diversifikasiyası:** İnvestisiyaları müxtəlif valyutalara yayın. Bu yolla, bir valyutanın düşməsi başqasının yüksəlməsi ilə tarazlaşdırıla bilər.
      *   **Valyuta Hedcinqli İnvestisiya Vasitələri:** Valyuta dəyişkənliyini avtomatik olaraq kompensasiya etmək üçün nəzərdə tutulmuş qarşılıqlı fondları və ya ETF-ləri nəzərdən keçirin.
      *   **Çox Valyutalı Hesablar:** Fərqli valyutaları saxlamaq mübadilə riskini idarə etməyə kömək edə bilər.
      *   **Məlumatlı Olun:** Vaxtında qərarlar qəbul etmək üçün valyuta tendensiyalarını və bazar xəbərlərini izləmək üçün alətlərdən və proqramlardan istifadə edin.
      *   **Güclü İqtisadiyyatlara/Stabil Valyutalara İnvestisiya Edin:** Stabil valyutaları və güclü iqtisadiyyatları olan ölkələri seçmək, ekstremal dəyişikliklərin təsirini azalda bilər. İnkişaf etmiş bazarlar adətən daha çox valyuta sabitliyi təklif edir.

      Texnologiya inkişafının və valyuta dinamikasının hökm sürdüyü bir dünyada çevik investisiya daimi ayıqlıq və sürətli qərar qəbul etməyi tələb edir. Əsas tendensiyaları başa düşərək və ağıllı strategiyalar tətbiq edərək, bazar dəyişkənliyini gəlirli fürsətlərə çevirmək üçün özünüzü mövqeləndirə bilərsiniz. Məlumatlı qalın, çevik olun və uğurlu investisiyalar!
    </content_az>
  </article>
</response_article>
I hope you liked it!
"""

# The long text above stands in for an LLM response to our enhanced prompt.
# Extract the article from it with the prompt object:
prompt.extract(response_txt)
data = prompt.data  # .extract() stores the parsed result here
print(data.article.title_az)
# "Çevik İnvestisiya: Hazırkı Texnologiya Artımlarından və Qlobal Valyuta Dinamikalarından Faydalanma"
``` 
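
How a root element might be located inside a chatty reply like the one above can be sketched with the standard library alone. This is not TextLasso's implementation: `to_snake_case` and `find_xml_root` are hypothetical helper names, and the only thing taken from the README is the rule that the root element matches the dataclass name in snake_case.

```python
import re
import xml.etree.ElementTree as ET


def to_snake_case(name: str) -> str:
    """Convert a CamelCase class name to snake_case (ResponseArticle -> response_article)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def find_xml_root(text: str, root_tag: str) -> ET.Element:
    """Locate <root_tag>...</root_tag> inside surrounding chatter and parse it."""
    tag = re.escape(root_tag)
    pattern = re.compile(rf"<{tag}\b.*?</{tag}>", re.DOTALL)
    match = pattern.search(text)
    if match is None:
        raise ValueError(f"no <{root_tag}> element found")
    return ET.fromstring(match.group(0))


reply = (
    "Of course, here it is: <response_article><article>"
    "<title_eng>Hi</title_eng></article></response_article> Enjoy!"
)
root = find_xml_root(reply, to_snake_case("ResponseArticle"))
print(root.find("article/title_eng").text)  # Hi
```

The text before and after the element is simply ignored, which is the behavior the example above relies on.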

## 📚 Comprehensive Examples

### 1. Basic Text Extraction

#### JSON Extraction with Fallback Strategies

```python
from dataclasses import dataclass
from typing import List, Optional
from textlasso import extract

@dataclass
class Product:
    name: str
    price: float
    category: str
    in_stock: bool
    tags: Optional[List[str]] = None

# Works with clean JSON
clean_json = '{"name": "Laptop", "price": 999.99, "category": "Electronics", "in_stock": true}'

# Works with markdown-wrapped JSON
markdown_json = """
Here's your product data:
```json
{
    "name": "Wireless Headphones",
    "price": 199.99,
    "category": "Electronics", 
    "in_stock": false,
    "tags": ["wireless", "bluetooth", "noise-canceling"]
}
\```
"""

# Works with messy responses
messy_response = """
Let me extract that product information for you...

The product details are: {"name": "Smart Watch", "price": 299.99, "category": "Wearables", "in_stock": true}

Is this what you were looking for?
"""

# All of these work automatically
products = [
    extract(clean_json, Product, extract_strategy='json'),
    extract(markdown_json, Product, extract_strategy='json'), 
    extract(messy_response, Product, extract_strategy='json')
]

for product in products:
    print(f"{product.name}: ${product.price} ({'✅' if product.in_stock else '❌'})")
```
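
To make the idea of "fallback strategies" concrete, here is a minimal standard-library sketch of one possible chain: parse the whole text, then look inside a markdown fence, then scan for the first decodable JSON value. The function name `find_json` and the order of the strategies are assumptions for illustration; TextLasso's actual internals may differ.

```python
import json
import re
from typing import Any

# Matches a fenced block, optionally labelled "json", and captures its body.
FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def find_json(text: str) -> Any:
    """Return the first JSON value found in text, trying three strategies in order."""
    # Strategy 1: the whole text is already valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Strategy 2: JSON wrapped in a markdown code fence.
    match = FENCE_RE.search(text)
    if match:
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            pass
    # Strategy 3: scan for the first "{" or "[" that starts a decodable value.
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                value, _end = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON value found in text")


print(find_json('The details are: {"name": "Smart Watch", "in_stock": true} ok?'))
```

The strategies run from cheapest to most permissive, so clean input never pays for the scan.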

#### XML Extraction

```python
from dataclasses import dataclass
from typing import List, Optional
from textlasso import extract

@dataclass 
class Address:
    street: str
    city: str
    country: str
    zip_code: Optional[str] = None
    
@dataclass
class ResponseAddress:
    address: Address

xml_data = """
<address>
    <street>123 Main St</street>
    <city>San Francisco</city>
    <country>USA</country>
    <zip_code>94102</zip_code>
</address>
"""

response_address = extract(xml_data, ResponseAddress, extract_strategy='xml')
print(f"Address: {response_address.address.street}, {response_address.address.city}, {response_address.address.country}")
# Address: 123 Main St, San Francisco, USA
```

### 2. Complex Nested Data Structures

```python
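from dataclasses import dataclass
from typing import List, Optional
from enum import Enum

from textlasso import extract

# Note: `Address` is the dataclass defined in the XML extraction example above.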

class Department(Enum):
    ENGINEERING = "engineering"
    MARKETING = "marketing" 
    SALES = "sales"
    HR = "hr"

@dataclass
class Employee:
    id: int
    name: str
    department: Department
    salary: float
    skills: List[str]
    manager_id: Optional[int] = None

@dataclass
class Company:
    name: str
    founded_year: int
    employees: List[Employee]
    headquarters: Address

complex_json = """
{
    "name": "TechCorp Inc",
    "founded_year": 2015,
    "headquarters": {
        "street": "100 Tech Plaza",
        "city": "Austin", 
        "country": "USA",
        "zip_code": "78701"
    },
    "employees": [
        {
            "id": 1,
            "name": "Sarah Chen", 
            "department": "engineering",
            "salary": 120000,
            "skills": ["Python", "React", "AWS"],
            "manager_id": null
        },
        {
            "id": 2,
            "name": "Mike Rodriguez",
            "department": "marketing", 
            "salary": 85000,
            "skills": ["SEO", "Content Strategy", "Analytics"],
            "manager_id": 1
        }
    ]
}
"""

company = extract(complex_json, Company, extract_strategy='json')
print(f"Company: {company.name} ({company.founded_year})")
print(f"HQ: {company.headquarters.city}, {company.headquarters.country}")
print(f"Employees: {len(company.employees)}")

for emp in company.employees:
    print(f"  - {emp.name} ({emp.department.value}): {', '.join(emp.skills)}")

# Company: TechCorp Inc (2015)
# HQ: Austin, USA
# Employees: 2
#   - Sarah Chen (engineering): Python, React, AWS
#   - Mike Rodriguez (marketing): SEO, Content Strategy, Analytics
```
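
The conversion of nested dictionaries into dataclasses, enums, and lists can be sketched with `typing` introspection. The `build` function below is a hypothetical stand-in written for illustration, not TextLasso's code; it handles only `Optional`, `List`, `Enum`, and dataclass fields and does no validation.

```python
from dataclasses import dataclass, fields, is_dataclass
from enum import Enum
from typing import Any, List, Optional, Union, get_args, get_origin, get_type_hints


def build(target: Any, value: Any) -> Any:
    """Recursively convert plain JSON-like data into the annotated type."""
    origin = get_origin(target)
    if origin is Union:  # Optional[X] is Union[X, None]
        if value is None:
            return None
        inner = [arg for arg in get_args(target) if arg is not type(None)]
        return build(inner[0], value)
    if origin is list:
        (item_type,) = get_args(target)
        return [build(item_type, item) for item in value]
    if isinstance(target, type) and issubclass(target, Enum):
        return target(value)  # look up the enum member by value
    if is_dataclass(target):
        hints = get_type_hints(target)
        kwargs = {
            f.name: build(hints[f.name], value[f.name])
            for f in fields(target)
            if f.name in value  # missing keys fall back to dataclass defaults
        }
        return target(**kwargs)
    return value  # plain str/int/float/bool pass through unchanged


class Department(Enum):
    ENGINEERING = "engineering"
    MARKETING = "marketing"


@dataclass
class Employee:
    name: str
    department: Department
    skills: List[str]
    manager_id: Optional[int] = None


@dataclass
class Team:
    name: str
    members: List[Employee]


team = build(Team, {
    "name": "Core",
    "members": [
        {"name": "Sarah Chen", "department": "engineering",
         "skills": ["Python"], "manager_id": None},
    ],
})
print(team.members[0].department.value)  # engineering
```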

### 3. LLM Response Cleaning

```python
from textlasso.cleaners import clear_llm_res

# Clean various LLM response formats
messy_responses = [
    "```json\n{\"key\": \"value\"}\n```",
    "```\n{\"key\": \"value\"}\n```",
    "Here's the data: {\"key\": \"value\"} hope it helps!",
    "```xml\n<root><item>data</item></root>\n```"
]

for response in messy_responses:
    clean_json = clear_llm_res(response, extract_strategy='json')
    clean_xml = clear_llm_res(response, extract_strategy='xml')
    print(f"Original: {response}")
    print(f"JSON cleaned: {clean_json}")
    print(f"XML cleaned: {clean_xml}")
    print("---")
```
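
For intuition, stripping a single surrounding code fence can be done with one regular expression. The helper below, `strip_code_fence`, is a hypothetical sketch and makes no claim about how `clear_llm_res` works internally; it only handles a fence whose opening and closing markers sit on their own lines.

```python
import re

# An opening fence with an optional language tag, the body, and a closing fence.
FENCE_RE = re.compile(r"^\s*```[a-zA-Z]*\s*\n(.*?)\n?\s*```\s*$", re.DOTALL)


def strip_code_fence(text: str) -> str:
    """Remove one surrounding markdown code fence, if present."""
    match = FENCE_RE.match(text)
    return match.group(1).strip() if match else text.strip()


print(strip_code_fence("```json\n{\"key\": \"value\"}\n```"))  # {"key": "value"}
print(strip_code_fence("  no fence here  "))  # no fence here
```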

### 4. Advanced Data Extraction with Configuration

```python
from dataclasses import dataclass
from typing import Optional
import logging

from textlasso import extract_from_dict

# Configure custom logging
logger = logging.getLogger("my_extractor")
logger.setLevel(logging.DEBUG)

@dataclass
class FlexibleData:
    required_field: str
    optional_field: Optional[str] = None
    number_field: int = 0

# Sample input: one type mismatch and one extra field
data_with_extra = {
    "required_field": "test",
    "optional_field": "optional", 
    "number_field": "123",  # String instead of int
    "extra_field": "ignored"  # Extra field
}

# Strict mode (default): raises an error on type mismatches
try:
    result_strict = extract_from_dict(
        data_with_extra, 
        FlexibleData,
        strict_mode=True,
        ignore_extra_fields=True,
        logger=logger
    )
    print("Strict mode result:", result_strict)
except Exception as e:
    print("Strict mode error:", e)

# Flexible mode - attempts conversion
result_flexible = extract_from_dict(
    data_with_extra,
    FlexibleData, 
    strict_mode=False,
    ignore_extra_fields=True,
    logger=logger
)
print("Flexible mode result:", result_flexible)
```
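
The difference between the two modes can be pictured as a per-field coercion rule. The sketch below is an assumption about the general idea, not a description of what `strict_mode` does inside TextLasso; `coerce` is a hypothetical helper.

```python
from typing import Any


def coerce(value: Any, expected: type, strict: bool) -> Any:
    """Return value as `expected`; in strict mode a type mismatch raises TypeError."""
    if isinstance(value, expected):
        return value
    if strict:
        raise TypeError(f"expected {expected.__name__}, got {type(value).__name__}")
    # Lenient mode: attempt a conversion, e.g. int("123") -> 123.
    return expected(value)


print(coerce("123", int, strict=False))  # 123

try:
    coerce("123", int, strict=True)
except TypeError as error:
    print("Strict mode error:", error)
```

Lenient conversion can still fail (for example `int("abc")` raises `ValueError`), so callers may want to catch that as well.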

### 5. Structured Prompt Generation

#### Basic Prompt Generation

Prompt generator (strictly speaking, builder) functions return a `Prompt` object, which holds the expected data structure, the prompt text, and a shortcut for data extraction:
- `prompt.prompt` - the enhanced prompt;
- `prompt.prompt_original` - the original prompt;
- `prompt.schema` - the dataclass describing the expected structure;
- `prompt.strategy` - the extraction strategy (json/xml);
- `prompt.extract("<text to extract>")` - extracts data from the given text;
- `prompt.data` - initially `None`; `.extract()` overwrites it with the extracted data;
- `prompt.has_data()` - returns True if `prompt.data` contains data.


**Note:** The `prompt.extract()` method is a shortcut for the `textlasso.extract()` function.
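
To picture how these attributes fit together, here is a small stand-in container. `PromptSketch` is hypothetical and written only for illustration: it parses flat JSON, and the detail that `extract()` returns the parsed object (in addition to storing it in `data`) is an assumption, not something confirmed by the library.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PromptSketch:
    prompt: str             # enhanced prompt text
    prompt_original: str    # prompt before enhancement
    schema: type            # dataclass describing the expected structure
    strategy: str           # "json" in this sketch
    data: Optional[Any] = None

    def extract(self, text: str) -> Any:
        """Parse text into the schema and remember the result in `data`."""
        self.data = self.schema(**json.loads(text))
        return self.data

    def has_data(self) -> bool:
        return self.data is not None


@dataclass
class Point:
    x: int
    y: int


sketch = PromptSketch("Give me a point. Respond in JSON.", "Give me a point.", Point, "json")
print(sketch.has_data())  # False
sketch.extract('{"x": 1, "y": 2}')
print(sketch.data)        # Point(x=1, y=2)
```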


```python
from dataclasses import dataclass
from typing import List, Optional

from textlasso import generate_structured_prompt

@dataclass
class UserFeedback:
    rating: int
    comment: str
    category: str
    recommended: bool
    issues: Optional[List[str]] = None

# Generate a structured prompt
prompt = generate_structured_prompt(
    prompt="Analyze this customer review and extract structured feedback",
    schema=UserFeedback,
    strategy="json",
    include_schema_description=True,
    example_count=2
)

print(prompt.prompt)
# Output:
# Analyze this customer review and extract structured feedback

# ## OUTPUT FORMAT REQUIREMENTS

# You must respond with a valid JSON object that follows this exact structure:

# ### Schema: UserFeedback
# - **rating**: int (required)
# - **comment**: str (required)
# - **category**: str (required)
# - **recommended**: bool (required)
# - **issues**: Array of str (optional)


# ### JSON Format Rules:
# - Use proper JSON syntax with double quotes for strings
# - Include all required fields
# - Use null for optional fields that are not provided
# - Arrays should contain objects matching the specified structure
# - Numbers should not be quoted
# - Booleans should be true/false (not quoted)


# ## EXAMPLES

# Here are 2 examples of the expected JSON format:

# ### Example 1:
# ```json
# {
#   "rating": 1,
#   "comment": "example_comment_1",
#   "category": "example_category_1",
#   "recommended": true,
#   "issues": [
#     "example_issues_item_1",
#     "example_issues_item_2"
#   ]
# }
# ```

# ### Example 2:
# ```json
# {
#   "rating": 2,
#   "comment": "example_comment_2",
#   "category": "example_category_2",
#   "recommended": false,
#   "issues": [
#     "example_issues_item_1",
#     "example_issues_item_2",
#     "example_issues_item_3"
#   ]
# }
# ```

# Remember: Your response must be valid JSON that matches the specified structure exactly.
```
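
The schema section of such a prompt can be derived from a dataclass with a few lines of introspection. The sketch below reproduces the line format shown in the output above; `type_label` and `describe_schema` are hypothetical names, and treating "has a default" as "optional" is an assumption about how the library decides between the two labels.

```python
from dataclasses import MISSING, dataclass, fields
from typing import List, Optional, Union, get_args, get_origin, get_type_hints


def type_label(tp) -> str:
    """Render a type annotation in the style used by the prompt."""
    origin = get_origin(tp)
    if origin is Union:  # Optional[X]: describe X
        inner = [arg for arg in get_args(tp) if arg is not type(None)]
        return type_label(inner[0])
    if origin is list:
        return f"Array of {type_label(get_args(tp)[0])}"
    return getattr(tp, "__name__", str(tp))


def describe_schema(schema) -> str:
    """Build the '### Schema' section for a dataclass."""
    hints = get_type_hints(schema)
    lines = [f"### Schema: {schema.__name__}"]
    for f in fields(schema):
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        status = "optional" if has_default else "required"
        lines.append(f"- **{f.name}**: {type_label(hints[f.name])} ({status})")
    return "\n".join(lines)


@dataclass
class UserFeedback:
    rating: int
    comment: str
    category: str
    recommended: bool
    issues: Optional[List[str]] = None


print(describe_schema(UserFeedback))
```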

#### Using the Decorator for Function Enhancement

If you have prompt-returning functions, you can use the `@structured_output` decorator to automatically enhance their prompts with structure requirements.

```python
from dataclasses import dataclass
from typing import Optional, List

from textlasso import structured_output

@dataclass
class NewsArticle:
    title: str
    summary: str
    category: str
    sentiment: str
    key_points: List[str]
    publication_date: Optional[str] = None

# decorate prompt-returning function
@structured_output(schema=NewsArticle, strategy="xml", example_count=1)
def create_article_analysis_prompt(article_text: str) -> str:
    return f"""
    Analyze the following news article and extract key information:
    
    Article: {article_text}
    
    Please provide a comprehensive analysis focusing on the main themes,
    sentiment, and key takeaways.
    """

# The decorator automatically enhances your prompt with structure requirements
article_text = "Breaking: New AI breakthrough announced by researchers..."
enhanced_prompt = create_article_analysis_prompt(article_text)

# This prompt now includes schema definitions, examples, and format requirements
print("Enhanced prompt: ", enhanced_prompt.prompt)

# Enhanced prompt:  
#     Analyze the following news article and extract key information:
    
#     Article: Breaking: New AI breakthrough announced by researchers...
    
#     Please provide a comprehensive analysis focusing on the main themes,
#     sentiment, and key takeaways.
    


# ## OUTPUT FORMAT REQUIREMENTS

# You must respond with a valid XML object that follows this exact structure:

# ### Schema: NewsArticle
# - **title**: str (required)
# - **summary**: str (required)
# - **category**: str (required)
# - **sentiment**: str (required)
# - **key_points**: Array of str (required)
# - **publication_date**: str (optional)


# ### XML Format Rules:
# - Use proper XML syntax with opening and closing tags
# - Root element should match the main dataclass name
# - Use snake_case for element names
# - For arrays, repeat the element name for each item
# - Use self-closing tags for null/empty optional fields
# - Include all required fields as elements
```
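
The decorator pattern itself is easy to picture: wrap the prompt-returning function and append the format section to whatever it returns. The sketch below uses a hypothetical `with_format_rules` decorator that returns a plain string; the real `structured_output` returns a `Prompt` object and builds the rules from the schema, so treat this only as an outline of the mechanism.

```python
import functools
from typing import Callable


def with_format_rules(rules: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator factory: append `rules` to the prompt the wrapped function returns."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs) -> str:
            base_prompt = func(*args, **kwargs)
            return f"{base_prompt}\n\n## OUTPUT FORMAT REQUIREMENTS\n\n{rules}"
        return wrapper
    return decorator


@with_format_rules("You must respond with a valid XML object.")
def get_prompt(title: str) -> str:
    return f"Hi, give me an article about {title}"


enhanced = get_prompt("agile investing")
print(enhanced)
```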

### 6. Real-World Use Cases

#### Processing Survey Responses

```python
from dataclasses import dataclass
from typing import List

from textlasso import extract

@dataclass
class SurveyResponse:
    respondent_id: str
    age_group: str
    satisfaction_rating: int
    feedback: str
    would_recommend: bool
    improvement_areas: List[str]

# Simulating LLM processing of survey data
llm_survey_output = """
Based on the survey response, here's the extracted data:

\```json
{
    "respondent_id": "RESP_001",
    "age_group": "25-34", 
    "satisfaction_rating": 4,
    "feedback": "Great service overall, but could improve response time",
    "would_recommend": true,
    "improvement_areas": ["response_time", "pricing"]
}
\```

This response indicates positive sentiment with specific improvement suggestions.
"""

survey = extract(llm_survey_output, SurveyResponse, extract_strategy='json')
print(survey)
# SurveyResponse(respondent_id='RESP_001', age_group='25-34', satisfaction_rating=4, feedback='Great service overall, but could improve response time', would_recommend=True, improvement_areas=['response_time', 'pricing'])
```

#### E-commerce Product Extraction

```python
from dataclasses import dataclass

from textlasso import extract, structured_output

@dataclass
class ProductReview:
    product_id: str
    reviewer_name: str
    rating: int
    review_text: str
    verified_purchase: bool
    helpful_votes: int
    review_date: str

@structured_output(schema=ProductReview, strategy="xml")
def create_review_extraction_prompt(raw_review: str) -> str:
    return f"""
        Extract structured information from this product review:
        
        {raw_review}
        
        Pay attention to implicit ratings, sentiment, and any verification indicators. 
        """

raw_review = """
    ★★★★☆ Amazing headphones! by John D. (Verified Purchase) - March 15, 2024
    These headphones exceeded my expectations. Great sound quality and comfortable fit.
    Battery life could be better but overall very satisfied. Would definitely buy again!
    👍 47 people found this helpful
    """

extraction_prompt = create_review_extraction_prompt(raw_review)
# Send the enhanced prompt to your LLM, then extract the response.
# `llm` is a placeholder for your own LLM client.
llm_response = llm.invoke(extraction_prompt.prompt)
review = extract(llm_response, ProductReview, extract_strategy='xml')
# or use the shortcut on the Prompt object (the result is also stored in `extraction_prompt.data`):
review = extraction_prompt.extract(llm_response)
```
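
For readers curious what the XML side of this flow might involve, the sketch below parses a flat XML answer into `ProductReview` using only the standard library. It is an assumption-laden illustration rather than TextLasso's code: `parse_bool`, `review_from_xml`, the sample response, and the `HP-001` product ID are all made up for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class ProductReview:
    product_id: str
    reviewer_name: str
    rating: int
    review_text: str
    verified_purchase: bool
    helpful_votes: int
    review_date: str


def parse_bool(text: str) -> bool:
    # XML carries booleans as text, so map common spellings explicitly
    return text.strip().lower() in {"true", "1", "yes"}


def review_from_xml(xml_text: str) -> ProductReview:
    """Build a ProductReview from a flat <product_review> element."""
    root = ET.fromstring(xml_text)
    converters = {int: int, bool: parse_bool, str: str}
    values = {}
    for f in fields(ProductReview):
        node = root.find(f.name)
        if node is None or node.text is None:
            raise ValueError(f"missing required element: {f.name}")
        # Convert the element text to the annotated field type
        values[f.name] = converters[f.type](node.text.strip())
    return ProductReview(**values)


# Hypothetical LLM answer; the product ID is invented for this example
sample_response = """
<product_review>
  <product_id>HP-001</product_id>
  <reviewer_name>John D.</reviewer_name>
  <rating>4</rating>
  <review_text>Great sound quality and comfortable fit.</review_text>
  <verified_purchase>true</verified_purchase>
  <helpful_votes>47</helpful_votes>
  <review_date>2024-03-15</review_date>
</product_review>
"""

review = review_from_xml(sample_response.strip())
print(review.rating, review.verified_purchase, review.helpful_votes)
```

The explicit boolean converter matters because `bool("false")` is `True` in Python; any XML-to-dataclass conversion has to handle that case deliberately.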

## 🔧 Configuration Options

### Extraction Configuration

```python
from textlasso import extract_from_dict
import logging

# Configure extraction behavior
# (`your_data`, `YourDataClass`, and `custom_logger` are placeholders for your own objects)
result = extract_from_dict(
    data_dict=your_data,
    target_class=YourDataClass,
    strict_mode=False,          # Allow type conversions
    ignore_extra_fields=True,   # Ignore unknown fields
    logger=custom_logger,       # Custom logging
    log_level=logging.DEBUG     # Detailed logging
)
```
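
The difference between strict and flexible handling is easiest to see in a toy converter. The sketch below is not `extract_from_dict` itself; `build` and the `Reading` dataclass are hypothetical, and the only thing borrowed from the library is the meaning of the `strict_mode` and `ignore_extra_fields` flags as described in the comments above.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class Reading:  # hypothetical schema, for illustration only
    sensor: str
    value: float
    count: int = 0


def build(data: Dict[str, Any], cls, strict_mode: bool = True,
          ignore_extra_fields: bool = True):
    """Toy dict-to-dataclass converter; NOT TextLasso's implementation."""
    known = {f.name: f.type for f in fields(cls)}
    extra = set(data) - set(known)
    if extra and not ignore_extra_fields:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    kwargs = {}
    for name, expected in known.items():
        if name not in data:
            continue  # fall back to the dataclass default, if any
        value = data[name]
        if not isinstance(value, expected):
            if strict_mode:
                raise TypeError(f"{name}: expected {expected.__name__}, "
                                f"got {type(value).__name__}")
            value = expected(value)  # flexible mode: attempt a conversion
        kwargs[name] = value
    return cls(**kwargs)


raw = {"sensor": "t1", "value": "21.5", "count": "3", "unit": "C"}

# Flexible mode converts the string values and drops the unknown "unit" key
flexible = build(raw, Reading, strict_mode=False)

# Strict mode rejects the first mistyped field instead of converting it
try:
    build(raw, Reading, strict_mode=True)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)

print(flexible)
print(strict_error)
```

This toy version only understands plain types such as `str`, `int`, and `float`; nested dataclasses, `Optional`, and lists would need extra handling.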

### Prompt Generation Configuration

```python
from textlasso import generate_structured_prompt

prompt = generate_structured_prompt(
    prompt="Your base prompt",
    schema=YourSchema,
    strategy="json",                    # or "xml"
    include_schema_description=True,    # Include field descriptions
    example_count=3                     # Number of examples (1-3)
)
```
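
To illustrate what a schema description contains, the following sketch derives one from a dataclass with the standard library alone. It is a rough approximation, not the library's generator: `type_label` and `describe_schema` are hypothetical helpers, and the line format simply imitates the schema block shown in the generated prompts earlier in this README.

```python
from dataclasses import MISSING, dataclass, fields
from typing import List, Optional, Union, get_args, get_origin


def type_label(tp) -> str:
    origin = get_origin(tp)
    if origin is Union:
        # Optional[X] is Union[X, None]; describe only the non-None part
        inner = [a for a in get_args(tp) if a is not type(None)]
        return " | ".join(type_label(a) for a in inner)
    if origin is list:
        (item,) = get_args(tp)
        return f"Array of {type_label(item)}"
    return tp.__name__


def describe_schema(cls) -> str:
    lines = [f"### Schema: {cls.__name__}"]
    for f in fields(cls):
        # A field with a default (or default factory) is treated as optional
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        status = "optional" if has_default else "required"
        lines.append(f"- **{f.name}**: {type_label(f.type)} ({status})")
    return "\n".join(lines)


@dataclass
class UserFeedback:
    rating: int
    comment: str
    category: str
    recommended: bool
    issues: Optional[List[str]] = None


description = describe_schema(UserFeedback)
print(description)
```

Treating "has a default" as "optional" is a simplifying assumption of this sketch; a real generator may also inspect `Optional[...]` annotations.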

## 📖 API Reference

### Core Functions

#### `extract(text, target_class, extract_strategy='json')`
Extract structured data from text.

**Parameters:**
- `text` (str): Raw text containing data to extract
- `target_class` (type): Dataclass to convert data into
- `extract_strategy` (Literal['json', 'xml']): Extraction strategy

**Returns:** Instance of `target_class`

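#### `extract_from_dict(data_dict, target_class, **options)`
Convert a dictionary to a dataclass, with options for strict type checking (`strict_mode`), extra-field handling (`ignore_extra_fields`), and logging (`logger`, `log_level`).

#### `generate_structured_prompt(prompt, schema, strategy, **options)`
Generate an enhanced prompt that appends output-format requirements for the given schema. Returns a `Prompt` object.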

### Decorators

#### `@structured_output(schema, strategy='json', **options)`
Wrap a prompt-returning function so that it returns a `Prompt` object with structured output requirements appended.

#### `@chain_prompts(*prompt_funcs, separator='\n\n---\n\n')`
Chain multiple prompt functions together.

#### `@prompt_cache(maxsize=128)`
Cache prompt results for better performance.
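
The ideas behind chaining and caching can be sketched with the standard library. This is not how `chain_prompts` or `prompt_cache` are implemented, and it does not use their decorator syntax: `join_prompts`, `role_prompt`, `task_prompt`, and `build_prompt` are hypothetical names, and `functools.lru_cache` stands in for the caching idea.

```python
from functools import lru_cache
from typing import Callable


def join_prompts(*prompt_funcs: Callable[..., str], separator: str = "\n\n---\n\n"):
    """Return a function that calls each prompt function and joins the results."""
    def combined(*args, **kwargs) -> str:
        return separator.join(func(*args, **kwargs) for func in prompt_funcs)
    return combined


def role_prompt(topic: str) -> str:
    return f"You are an expert on {topic}."


def task_prompt(topic: str) -> str:
    return f"Write a short summary about {topic}."


@lru_cache(maxsize=128)  # stdlib cache standing in for the idea behind prompt_cache
def build_prompt(topic: str) -> str:
    return join_prompts(role_prompt, task_prompt)(topic)


first = build_prompt("XML parsing")
second = build_prompt("XML parsing")  # same argument, so served from the cache
print(first)
print(build_prompt.cache_info())
```

Caching by argument only pays off when prompt construction is expensive or repeated often, and it requires the arguments to be hashable.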

### Utilities

#### `clear_llm_res(text, extract_strategy)`
Clean LLM responses by removing code blocks and formatting.
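
As a rough picture of what cleaning a response can mean, the sketch below strips a Markdown code fence with a regular expression. It is an assumption about the general technique, not the body of `clear_llm_res`, and it ignores the `extract_strategy` argument; `FENCED_BLOCK` and `strip_code_fence` are hypothetical names.

```python
import re

FENCE = "`" * 3  # built at runtime so this example does not close its own fence

# Opening fence with an optional language tag, the body, then a closing fence
FENCED_BLOCK = re.compile(FENCE + r"[a-zA-Z]*\s*\n(.*?)\n\s*" + FENCE, re.DOTALL)


def strip_code_fence(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none."""
    match = FENCED_BLOCK.search(text)
    return match.group(1).strip() if match else text.strip()


wrapped = f'Here you go:\n{FENCE}json\n{{"key": "value"}}\n{FENCE}\nHope it helps!'
cleaned_json = strip_code_fence(wrapped)
cleaned_xml = strip_code_fence("  <root><item>data</item></root>  ")
print(cleaned_json)
print(cleaned_xml)
```

Text without a fence is returned unchanged apart from surrounding whitespace, so the helper is safe to apply to every response before parsing.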

## 🤝 Contributing

We welcome contributions! Here's how to get started:

1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make your changes and add tests
4. Run tests: `pytest`
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- Built for the AI/LLM community
- Inspired by the need for robust text processing in AI applications
- Special thanks to all contributors and users

## 📞 Support

- 📧 Email: aziznadirov@yahoo.com
- 🐛 Issues: [GitHub Issues](https://github.com/AzizNadirov/textlasso/issues)

---

**TextLasso** - Wrangle your text data with ease! 🤠

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "textlasso",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "Aziz Nadirov <aziznadirov@yahoo.com>",
    "keywords": "llm, text, crawl, extract, text-cleaning",
    "author": null,
    "author_email": "Aziz Nadirov <aziznadirov@yahoo.com>",
    "download_url": "https://files.pythonhosted.org/packages/21/b7/d626b649f12a96ed36a056bb898563b577c520e5dc88f131fa87c7760577/textlasso-0.1.3.tar.gz",
    "platform": null,
    "description": "# TextLasso \ud83e\udd20\n\n[![PyPI version](https://badge.fury.io/py/textlasso.svg)](https://badge.fury.io/py/textlasso)\n[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n**TextLasso** is a simple Python library for extracting structured data from raw text, with special focus on processing LLM (Large Language Model) responses. Whether you're parsing JSON buried in markdown, extracting data from XML, or need to generate structured prompts for AI models, TextLasso has you covered.\n\n## \u2728 Key Features\n\n- \ud83c\udfaf **Smart Text Extraction**: Extract structured data from messy text with multiple fallback strategies\n- \ud83e\uddf9 **LLM Response Cleaning**: Automatically clean code blocks, markdown artifacts, and formatting\n- \ud83c\udfd7\ufe0f **Dataclass Integration**: Convert raw text directly to Python dataclasses with type validation\n- \ud83e\udd16 **AI Prompt Generation**: Generate structured prompts with schema validation and examples\n- \ud83d\udcca **Multiple Formats**: Support for JSON, XML, and extensible to other formats\n- \ud83d\udd27 **Flexible Configuration**: Configurable error handling, logging, and validation modes\n- \ud83c\udfa8 **Decorator Support**: Enhance existing functions with structured output capabilities\n\n## \ud83d\ude80 Quick Start\n\n### Installation\n\n```bash\npip install textlasso\n```\n\n### Basic Usage\n\n#### Enhancing Prompts\n\n```python\nfrom dataclasses import dataclass\nfrom textlasso import generate_structured_prompt, structured_output\n\n# 1. Response Data Class\n@dataclass\nclass Article:\n    title_eng: str\n    title_az: str\n    content_eng: str\n    content_az: str\n\n@dataclass\nclass ResponseArticle:\n    article: Article\n\n\noriginal_prompt = \"You are a professional copywriter-bot. 
Generate me an article\"\nenhanced_prompt = generate_structured_prompt(prompt=original_prompt, \n                                    schema=ResponseArticle, \n                                    strategy=\"xml\")\n\n# prompt\nprint(enhanced_prompt)   # <Prompt: schema='<class '__main__.ResponseArticle'>', strategy='xml', has_data='False'>\n## enhanced prompt:\nprint(enhanced_prompt.prompt)\n# You are a professional copywriter-bot. Generate me an article\n\n\n# ## OUTPUT FORMAT REQUIREMENTS\n\n# You must respond with a valid XML object that follows this exact structure:\n\n# ### Schema: ResponseArticle\n# - **article**: Object (Article) (required)\n#   Fields:\n#     - **title_eng**: str (required)\n#     - **title_az**: str (required)\n#     - **content_eng**: str (required)\n#     - **content_az**: str (required)\n\n\n# ### XML Format Rules:\n# - Use proper XML syntax with opening and closing tags\n# - Root element should match the main dataclass name\n# - Use snake_case for element names\n# - For arrays, repeat the element name for each item\n# - Use self-closing tags for null/empty optional fields\n# - Include all required fields as elements\n\n\n# ## EXAMPLES\n\n# Here are 2 examples of the expected XML format:\n\n# ### Example 1:\n# ```xml\n# <response_article>\n#   <article>\n#     <title_eng>example_title_eng_1</title_eng>\n#     <title_az>example_title_az_1</title_az>\n#     <content_eng>example_content_eng_1</content_eng>\n#     <content_az>example_content_az_1</content_az>\n#   </article>\n# </response_article>\n\n# ```\n\n# ### Example 2:\n# ```xml\n# <response_article>\n#   <article>\n#     <title_eng>example_title_eng_2</title_eng>\n#     <title_az>example_title_az_2</title_az>\n#     <content_eng>example_content_eng_2</content_eng>\n#     <content_az>example_content_az_2</content_az>\n#   </article>\n# </response_article>\n\n# ```\n\n# Remember: Your response must be valid XML that matches the specified structure exactly.\n```\n\nIf you have a 
prompt-returning function, you can use the `structured_output` decorator to automatically enhance it with structure requirements:\n\n```python\n\n@structured_output(ResponseArticle, \"xml\")\ndef get_prompt(title: str):\n    return f\"Hi, give me a article about {title}\"\n\nprompt = get_prompt(\"Agile Investing: Profiting from Current Tech Surges and Global Currency Dynamics\")\n```\n\nBot `generate_structured_prompt` and `structured_output` returns a `Prompt` object, which has a `prompt` attribute containing the enhanced prompt, `schema` attribute containing the dataclass, and `strategy` attribute containing the extraction strategy. \n\n**Extraction**: the best part - you can extract data using your `prompt` object, without having to worry about the structure requirements:\n\n```python\n\n# response_txt = some_llm.invoke(prompt.prompt)\nresponse_txt = \"\"\"Ofcourse here is the article: <response_article>\n  <article>\n    <title_eng>Agile Investing: Profiting from Current Tech Surges and Global Currency Dynamics</title_eng>\n    <title_az>\u00c7evik \u0130nvestisiya: Haz\u0131rk\u0131 Texnologiya Art\u0131mlar\u0131ndan v\u0259 Qlobal Valyuta Dinamikalar\u0131ndan Faydalanma</title_az>\n    <content_eng>\n      Hello, aspiring investors! Invesbot here, ready to guide you through the ever-shifting landscape of finance. Today, we're diving into the exciting world of \"agile investing,\" a strategy designed to help you thrive amidst current tech surges and dynamic global currency movements. Remember, investment is dynamic, so stay tuned!\n\n      ## The Power of Agile Investing\n\n      Traditional \"buy-and-hold\" strategies might be comforting, but in today's fast-paced markets, agility is key. Agile investing emphasizes flexibility and quick decision-making to respond to rapid market changes. 
It's about actively managing your portfolio and keenly understanding market conditions to capitalize on opportunities and mitigate risks.\n\n      ### Key Principles of Agile Investing:\n\n      *   **Active Management:** Forget setting it and forgetting it. Agile investing requires continuous monitoring of market trends.\n      *   **Technical Analysis:** Use tools like moving averages and Bollinger Bands to identify optimal entry and exit points.\n      *   **Diversification:** Spread your investments across different asset classes to reduce risk, especially in volatile markets.\n      *   **Stop-Loss Orders:** Implement automated instructions to sell a security when it reaches a certain price, limiting potential losses.\n      *   **Short Selling:** Profit from declining prices by borrowing shares, selling them high, and repurchasing them low.\n\n      ## Navigating the Tech Surge: Where to Invest\n\n      The technology sector continues its rapid evolution, fueled by advancements in Artificial Intelligence (AI), cybersecurity, climate tech, and more. The investment climate for frontier technologies has stabilized and, in many cases, rebounded in 2024, with levels of equity investment increasing in areas like cloud and edge computing, bioengineering, and space technologies. Worldwide spending on AI is projected to grow significantly, with a compound annual growth rate of 29% from 2024 to 2028.\n\n      ### Hot Tech Sectors and Companies to Watch:\n\n      *   **Artificial Intelligence (AI):** AI is powering innovation across industries, with significant investment in AI infrastructure. Hybrid AI, which combines various AI methodologies for more versatile systems, is moving beyond experimentation. Companies like Innodata are key in providing high-quality training data for generative AI systems.\n      *   **Data Centers & Cloud Computing:** The expansion of data centers and public cloud services is a critical trend, driven by AI innovation. 
Global spending on public cloud services is projected to reach US$805 billion in 2024 and double by 2028.\n      *   **Cybersecurity:** With escalating threats and a widening attack surface (IoT, generative AI, cloud computing), cybersecurity is a critical tech priority. The global cost of cybercrime is projected to reach US$10.5 trillion in 2025. The market for security products is also growing rapidly, expected to reach US$200 billion by 2028.\n      *   **Quantum Computing:** This sector has seen an explosion of interest and significant rallies in emerging quantum stocks.\n      *   **Semiconductors:** This industry has benefited from major corporate investments in AI infrastructure, and the \"picks and shovels\" phase of generative AI continues to favor semiconductor and hardware companies.\n      *   **Sustainable Technologies/Climate Tech:** Solutions from carbon capture to energy-efficient building materials are gaining momentum, fueled by a growing focus on sustainable business and combating climate change.\n      *   **Supply Chain Innovation:** Enhancing efficiency and transparency through digital freight-forwarding services and real-time tracking are key.\n\n      For those looking at specific stocks, the \"Magnificent Seven\" (Apple, Microsoft, Alphabet, Amazon, NVIDIA, Meta Platforms, and either Tesla or Broadcom) continue to lead the market in innovation. Some of the best-performing tech stocks by one-year return as of July 2025 include Palantir Technologies (PLTR), MicroStrategy (MSTR), Fortinet (FTNT), Shopify (SHOP), Broadcom (AVGO), Zscaler (ZS), and Cisco Systems (CSCO).\n\n      ## Understanding Global Currency Dynamics\n\n      Currency fluctuations can significantly impact your investment returns, especially when investing globally. Changes in exchange rates can diminish or enhance your returns when converting foreign asset values back to your home currency. 
Even if your portfolio consists only of domestic shares, there can be indirect exposure to currency risk if those companies conduct significant international business.\n\n      ### Factors Driving Currency Fluctuations:\n\n      *   **Economic Strength:** Confidence in a country's economic prospects typically leads to increased demand for its currency, pushing its value up.\n      *   **Interest Rates:** Higher interest rates can make investments in a country more attractive, drawing foreign capital and potentially increasing the domestic currency's value. However, very high interest rates can also slow economic growth.\n      *   **Trade Balances:** A country with more exports than imports may see its currency appreciate due to higher demand for its goods and, consequently, its currency.\n      *   **Political Environment and Market Sentiment:** Big global events, political instability, and overall investor attitude can create sudden currency swings.\n\n      ### Strategies for Managing Currency Risk:\n\n      *   **Hedging Techniques:** Use financial tools like forward contracts, currency futures, or options to lock in an exchange rate and reduce uncertainty.\n      *   **Diversifying Currency Exposure:** Spread investments across different currencies. This way, the fall of one currency might be balanced by the rise of another.\n      *   **Currency-Hedged Investment Vehicles:** Consider mutual funds or ETFs designed to automatically offset currency fluctuations.\n      *   **Multi-Currency Accounts:** Holding different currencies can help manage exchange risk.\n      *   **Stay Informed:** Use tools and apps to track currency trends and market news to make timely decisions.\n      *   **Invest in Strong Economies/Stable Currencies:** Choosing countries with stable currencies and strong economies can reduce the impact of extreme fluctuations. 
Developed markets generally offer more currency stability.\n\n      Agile investing in a world of tech surges and currency dynamics requires constant vigilance and quick decision-making. By understanding the underlying trends and employing smart strategies, you can position yourself to potentially turn market volatility into profitable opportunities. Stay informed, stay agile, and happy investing!\n    </content_eng>\n    <content_az>\n      Salam, g\u0259l\u0259c\u0259k investorlar! Qar\u015f\u0131n\u0131zda \u0130nvesbot var, maliyy\u0259nin daim d\u0259yi\u015f\u0259n m\u0259nz\u0259r\u0259sind\u0259 siz\u0259 yol g\u00f6st\u0259rm\u0259y\u0259 haz\u0131ram. Bu g\u00fcn biz \"\u00e7evik investisiya\"n\u0131n maraql\u0131 d\u00fcnyas\u0131na ba\u015f vururuq; bu strategiya siz\u0259 m\u00f6vcud texnoloji inki\u015faflar v\u0259 dinamik qlobal valyuta h\u0259r\u0259k\u0259tl\u0259ri fonunda u\u011fur qazanma\u011fa k\u00f6m\u0259k etm\u0259k \u00fc\u00e7\u00fcn n\u0259z\u0259rd\u0259 tutulub. Unutmay\u0131n, investisiya dinamikdir, ona g\u00f6r\u0259 d\u0259 bizi izl\u0259yin!\n\n      ## \u00c7evik \u0130nvestisiyan\u0131n G\u00fcc\u00fc\n\n      \u018fn\u0259n\u0259vi \"al-saxla\" strategiyalar\u0131 rahat ola bil\u0259r, lakin m\u00fcasir s\u00fcr\u0259tl\u0259 inki\u015faf ed\u0259n bazarlarda \u00e7eviklik \u0259sasd\u0131r. \u00c7evik investisiya s\u00fcr\u0259tli bazar d\u0259yi\u015fiklikl\u0259rin\u0259 reaksiya verm\u0259k \u00fc\u00e7\u00fcn \u00e7evikliyi v\u0259 s\u00fcr\u0259tli q\u0259rar q\u0259bul etm\u0259yi vur\u011fulay\u0131r. 
Bu, f\u00fcrs\u0259tl\u0259rd\u0259n yararlanmaq v\u0259 riskl\u0259ri azaltmaq \u00fc\u00e7\u00fcn portfelinizi aktiv \u015f\u0259kild\u0259 idar\u0259 etm\u0259k v\u0259 bazar \u015f\u0259rtl\u0259rini d\u0259rind\u0259n anlamaq dem\u0259kdir.\n\n      ### \u00c7evik \u0130nvestisiyan\u0131n \u018fsas Prinsipl\u0259ri:\n\n      *   **Aktiv \u0130dar\u0259etm\u0259:** Unudun ki, investisiyan\u0131 bir d\u0259f\u0259 qurub sonra yaddan \u00e7\u0131xarmaq olar. \u00c7evik investisiya bazar tendensiyalar\u0131n\u0131n davaml\u0131 izl\u0259nilm\u0259sini t\u0259l\u0259b edir.\n      *   **Texniki Analiz:** Optimal giri\u015f v\u0259 \u00e7\u0131x\u0131\u015f n\u00f6qt\u0259l\u0259rini m\u00fc\u0259yy\u0259n etm\u0259k \u00fc\u00e7\u00fcn h\u0259r\u0259k\u0259tli ortalamalar v\u0259 Bollinger Bantlar\u0131 kimi al\u0259tl\u0259rd\u0259n istifad\u0259 edin.\n      *   **Diversifikasiya:** X\u00fcsusil\u0259 d\u0259yi\u015fk\u0259n bazarlarda riski azaltmaq \u00fc\u00e7\u00fcn investisiyalar\u0131n\u0131z\u0131 m\u00fcxt\u0259lif aktiv sinifl\u0259rin\u0259 yay\u0131n.\n      *   **Stop-Loss \u018fmrl\u0259ri:** Potensial itkil\u0259ri m\u0259hdudla\u015fd\u0131rmaq \u00fc\u00e7\u00fcn qiym\u0259t m\u00fc\u0259yy\u0259n bir h\u0259dd\u0259 \u00e7atd\u0131qda qiym\u0259tli ka\u011f\u0131z\u0131 satmaq \u00fc\u00e7\u00fcn avtomatla\u015fd\u0131r\u0131lm\u0131\u015f t\u0259limatlar t\u0259tbiq edin.\n      *   **Q\u0131sa Sat\u0131\u015f (Short Selling):** S\u0259hml\u0259ri borc g\u00f6t\u00fcr\u0259r\u0259k, y\u00fcks\u0259k qiym\u0259t\u0259 sataraq v\u0259 a\u015fa\u011f\u0131 qiym\u0259t\u0259 yenid\u0259n alaraq qiym\u0259t d\u00fc\u015fm\u0259l\u0259rind\u0259n qazanc \u0259ld\u0259 edin.\n\n      ## Texnologiya Art\u0131m\u0131 il\u0259 Naviqasiya: Harada \u0130nvestisiya Etm\u0259li\n\n      Texnologiya sektoru S\u00fcni \u0130ntellekt (AI), kiber t\u0259hl\u00fck\u0259sizlik, iqlim texnologiyas\u0131 v\u0259 dig\u0259r sah\u0259l\u0259rd\u0259ki 
ir\u0259lil\u0259yi\u015fl\u0259rl\u0259 s\u00fcr\u0259tli inki\u015faf\u0131n\u0131 davam etdirir. \u00d6nc\u00fcl texnologiyalar \u00fc\u00e7\u00fcn investisiya iqlimi sabitl\u0259\u015fib v\u0259 bir \u00e7ox hallarda 2024-c\u00fc ild\u0259 b\u0259rpa olunub, bulud v\u0259 k\u0259nar hesablama, bio-m\u00fch\u0259ndislik v\u0259 kosmik texnologiyalar kimi sah\u0259l\u0259rd\u0259 s\u0259hm investisiyalar\u0131 art\u0131b. S\u00fcni \u0130ntellekt\u0259 qlobal x\u0259rcl\u0259rin 2024-c\u00fc ild\u0259n 2028-ci il\u0259 q\u0259d\u0259r 29% illik m\u00fcr\u0259kk\u0259b art\u0131m tempi il\u0259 \u0259h\u0259miyy\u0259tli d\u0259r\u0259c\u0259d\u0259 artaca\u011f\u0131 proqnozla\u015fd\u0131r\u0131l\u0131r.\n\n      ### \u0130zl\u0259nilm\u0259si Vacib Olan \u0130sti Texnologiya Sektorlar\u0131 v\u0259 \u015eirk\u0259tl\u0259r:\n\n      *   **S\u00fcni \u0130ntellekt (AI):** AI s\u0259nayel\u0259r \u00fczr\u0259 innovasiyalara t\u0259kan verir, AI infrastruktura \u0259h\u0259miyy\u0259tli investisiyalar qoyulur. Daha \u00e7ox y\u00f6nl\u00fc sisteml\u0259r \u00fc\u00e7\u00fcn m\u00fcxt\u0259lif AI metodologiyalar\u0131n\u0131 birl\u0259\u015fdir\u0259n Hibrid AI, eksperimentl\u0259rd\u0259n k\u0259nara \u00e7\u0131x\u0131r. Innodata kimi \u015firk\u0259tl\u0259r generativ AI sisteml\u0259ri \u00fc\u00e7\u00fcn y\u00fcks\u0259k keyfiyy\u0259tli t\u0259lim m\u0259lumatlar\u0131 t\u0259min etm\u0259kd\u0259 \u0259sas rol oynay\u0131r.\n      *   **M\u0259lumat M\u0259rk\u0259zl\u0259ri v\u0259 Bulud Hesablama:** M\u0259lumat m\u0259rk\u0259zl\u0259rinin v\u0259 ictimai bulud xidm\u0259tl\u0259rinin geni\u015fl\u0259nm\u0259si, AI innovasiyas\u0131 t\u0259r\u0259find\u0259n idar\u0259 olunan kritik bir tendensiyad\u0131r. 
\u0130ctimai bulud xidm\u0259tl\u0259rin\u0259 qlobal x\u0259rcl\u0259rin 2024-c\u00fc ild\u0259 805 milyard AB\u015e dollar\u0131na \u00e7ataca\u011f\u0131 v\u0259 2028-ci il\u0259 q\u0259d\u0259r ikiqat artaca\u011f\u0131 proqnozla\u015fd\u0131r\u0131l\u0131r.\n      *   **Kiber T\u0259hl\u00fck\u0259sizlik:** Artan t\u0259hdidl\u0259r v\u0259 geni\u015fl\u0259n\u0259n h\u00fccum s\u0259thi (IoT, generativ AI, bulud hesablama) il\u0259 kiber t\u0259hl\u00fck\u0259sizlik kritik bir texnologiya prioritetidir. Kiber cinay\u0259tkarl\u0131\u011f\u0131n qlobal x\u0259rcinin 2025-ci ild\u0259 10.5 trilyon AB\u015e dollar\u0131na \u00e7ataca\u011f\u0131 proqnozla\u015fd\u0131r\u0131l\u0131r. T\u0259hl\u00fck\u0259sizlik m\u0259hsullar\u0131 bazar\u0131 da s\u00fcr\u0259tl\u0259 b\u00f6y\u00fcy\u00fcr, 2028-ci il\u0259 q\u0259d\u0259r 200 milyard AB\u015e dollar\u0131na \u00e7ataca\u011f\u0131 g\u00f6zl\u0259nilir.\n      *   **Kvant Komp\u00fcteri:** Bu sektor maraqda partlay\u0131\u015f v\u0259 ortaya \u00e7\u0131xan kvant s\u0259hml\u0259rind\u0259 \u0259h\u0259miyy\u0259tli y\u00fcks\u0259li\u015fl\u0259r g\u00f6r\u00fcb.\n      *   **Yar\u0131mke\u00e7iricil\u0259r:** Bu s\u0259naye AI infrastrukturuna qoyulan \u0259sas korporativ investisiyalardan faydalan\u0131b v\u0259 generativ AI-nin \"k\u00fcr\u0259k v\u0259 qazma\" m\u0259rh\u0259l\u0259si yar\u0131mke\u00e7irici v\u0259 hardware \u015firk\u0259tl\u0259rini d\u0259st\u0259kl\u0259m\u0259y\u0259 davam edir.\n      *   **Davaml\u0131 Texnologiyalar/\u0130qlim Texnologiyas\u0131:** Karbon tutulmas\u0131ndan enerji s\u0259m\u0259r\u0259li tikinti materiallar\u0131na q\u0259d\u0259r h\u0259ll\u0259r, davaml\u0131 biznes\u0259 v\u0259 iqlim d\u0259yi\u015fikliyi il\u0259 m\u00fcbariz\u0259y\u0259 artan diqq\u0259t say\u0259sind\u0259 s\u00fcr\u0259t qazan\u0131r.\n      *   **T\u0259chizat Z\u0259nciri \u0130nnovasiyas\u0131:** R\u0259q\u0259msal y\u00fckl\u0259rin ekspeditor xidm\u0259tl\u0259ri v\u0259 real-vaxt 
izl\u0259m\u0259 vasit\u0259sil\u0259 s\u0259m\u0259r\u0259liliyin v\u0259 \u015f\u0259ffafl\u0131\u011f\u0131n art\u0131r\u0131lmas\u0131 \u0259sasd\u0131r.\n\n      M\u00fc\u0259yy\u0259n s\u0259hml\u0259r\u0259 baxanlar \u00fc\u00e7\u00fcn \"M\u00f6ht\u0259\u015f\u0259m Yeddi\" (Apple, Microsoft, Alphabet, Amazon, NVIDIA, Meta Platforms v\u0259 ya Tesla, ya da Broadcom) innovasiya sah\u0259sind\u0259 bazara r\u0259hb\u0259rlik etm\u0259y\u0259 davam edir. 2025-ci ilin iyul ay\u0131na olan m\u0259lumata g\u00f6r\u0259, bir illik g\u0259lirl\u0259 \u0259n yax\u015f\u0131 n\u0259tic\u0259 g\u00f6st\u0259r\u0259n texnologiya s\u0259hml\u0259rind\u0259n b\u0259zil\u0259ri Palantir Technologies (PLTR), MicroStrategy (MSTR), Fortinet (FTNT), Shopify (SHOP), Broadcom (AVGO), Zscaler (ZS) v\u0259 Cisco Systems (CSCO)-dur.\n\n      ## Qlobal Valyuta Dinamikalar\u0131n\u0131 Anlamaq\n\n      Valyuta d\u0259yi\u015fk\u0259nliyi, x\u00fcsusil\u0259 qlobal s\u0259viyy\u0259d\u0259 investisiya ed\u0259rk\u0259n, investisiya g\u0259lirl\u0259riniz\u0259 \u0259h\u0259miyy\u0259tli d\u0259r\u0259c\u0259d\u0259 t\u0259sir g\u00f6st\u0259r\u0259 bil\u0259r. Valyuta m\u0259z\u0259nn\u0259l\u0259rind\u0259ki d\u0259yi\u015fiklikl\u0259r xarici aktiv d\u0259y\u0259rl\u0259rini \u00f6z valyutan\u0131za \u00e7evir\u0259rk\u0259n g\u0259lirl\u0259rinizi azalda v\u0259 ya art\u0131ra bil\u0259r. 
Portfeliniz yaln\u0131z yerli s\u0259hml\u0259rd\u0259n ibar\u0259t olsa bel\u0259, h\u0259min \u015firk\u0259tl\u0259r \u0259h\u0259miyy\u0259tli beyn\u0259lxalq biznes f\u0259aliyy\u0259ti g\u00f6st\u0259rirs\u0259, dolay\u0131 valyuta riski m\u00f6vcud ola bil\u0259r.\n\n      ### Valyuta D\u0259yi\u015fiklikl\u0259rin\u0259 T\u0259sir Ed\u0259n Faktorlar:\n\n      *   **\u0130qtisadi G\u00fcc:** Bir \u00f6lk\u0259nin iqtisadi perspektivl\u0259rin\u0259 inam, ad\u0259t\u0259n valyutas\u0131na t\u0259l\u0259bat\u0131n artmas\u0131na v\u0259 d\u0259y\u0259rinin y\u00fcks\u0259lm\u0259sin\u0259 s\u0259b\u0259b olur.\n      *   **Faiz D\u0259r\u0259c\u0259l\u0259ri:** Daha y\u00fcks\u0259k faiz d\u0259r\u0259c\u0259l\u0259ri bir \u00f6lk\u0259d\u0259 investisiyalar\u0131 daha c\u0259lbedici ed\u0259 bil\u0259r, xarici kapital\u0131 c\u0259lb ed\u0259r\u0259k yerli valyutan\u0131n d\u0259y\u0259rini potensial olaraq art\u0131ra bil\u0259r. Bununla bel\u0259, \u00e7ox y\u00fcks\u0259k faiz d\u0259r\u0259c\u0259l\u0259ri iqtisadi art\u0131m\u0131 da l\u0259ngid\u0259 bil\u0259r.\n      *   **Ticar\u0259t Balanslar\u0131:** \u0130xracat\u0131 idxalat\u0131ndan \u00e7ox olan bir \u00f6lk\u0259, mallar\u0131na v\u0259 n\u0259tic\u0259d\u0259 valyutas\u0131na olan y\u00fcks\u0259k t\u0259l\u0259bat s\u0259b\u0259bind\u0259n valyutas\u0131n\u0131n d\u0259y\u0259r qazanmas\u0131na s\u0259b\u0259b ola bil\u0259r.\n      *   **Siyasi M\u00fchit v\u0259 Bazar Duy\u011fusu:** B\u00f6y\u00fck qlobal hadis\u0259l\u0259r, siyasi qeyri-sabitlik v\u0259 \u00fcmumi investor m\u00fcnasib\u0259ti q\u0259fil valyuta d\u0259yi\u015fiklikl\u0259ri yarada bil\u0259r.\n\n      ### Valyuta Riskini \u0130dar\u0259 Etm\u0259k \u00fc\u00e7\u00fcn Strategiyalar:\n\n      *   **Hedcinq Texnikalar\u0131:** M\u00fcbadil\u0259 m\u0259z\u0259nn\u0259sini kilidl\u0259m\u0259k v\u0259 qeyri-m\u00fc\u0259yy\u0259nliyi azaltmaq \u00fc\u00e7\u00fcn forvard m\u00fcqavil\u0259l\u0259ri, valyuta 
fyu\u00e7ersl\u0259ri v\u0259 ya opsionlar kimi maliyy\u0259 al\u0259tl\u0259rind\u0259n istifad\u0259 edin.\n      *   **Valyuta Ekspozisiyas\u0131n\u0131n Diversifikasiyas\u0131:** \u0130nvestisiyalar\u0131 m\u00fcxt\u0259lif valyutalara yay\u0131n. Bu yolla, bir valyutan\u0131n d\u00fc\u015fm\u0259si ba\u015fqas\u0131n\u0131n y\u00fcks\u0259lm\u0259si il\u0259 tarazla\u015fd\u0131r\u0131la bil\u0259r.\n      *   **Valyuta Hedcinqli \u0130nvestisiya Vasit\u0259l\u0259ri:** Valyuta d\u0259yi\u015fk\u0259nliyini avtomatik olaraq kompensasiya etm\u0259k \u00fc\u00e7\u00fcn n\u0259z\u0259rd\u0259 tutulmu\u015f qar\u015f\u0131l\u0131ql\u0131 fondlar\u0131 v\u0259 ya ETF-l\u0259ri n\u0259z\u0259rd\u0259n ke\u00e7irin.\n      *   **\u00c7ox Valyutal\u0131 Hesablar:** F\u0259rqli valyutalar\u0131 saxlamaq m\u00fcbadil\u0259 riskini idar\u0259 etm\u0259y\u0259 k\u00f6m\u0259k ed\u0259 bil\u0259r.\n      *   **M\u0259lumatl\u0131 Olun:** Vaxt\u0131nda q\u0259rarlar q\u0259bul etm\u0259k \u00fc\u00e7\u00fcn valyuta tendensiyalar\u0131n\u0131 v\u0259 bazar x\u0259b\u0259rl\u0259rini izl\u0259m\u0259k \u00fc\u00e7\u00fcn al\u0259tl\u0259rd\u0259n v\u0259 proqramlardan istifad\u0259 edin.\n      *   **G\u00fccl\u00fc \u0130qtisadiyyatlara/Stabil Valyutalara \u0130nvestisiya Edin:** Stabil valyutalar\u0131 v\u0259 g\u00fccl\u00fc iqtisadiyyatlar\u0131 olan \u00f6lk\u0259l\u0259ri se\u00e7m\u0259k, ekstremal d\u0259yi\u015fiklikl\u0259rin t\u0259sirini azalda bil\u0259r. \u0130nki\u015faf etmi\u015f bazarlar ad\u0259t\u0259n daha \u00e7ox valyuta sabitliyi t\u0259klif edir.\n\n      Texnologiya inki\u015faf\u0131n\u0131n v\u0259 valyuta dinamikas\u0131n\u0131n h\u00f6km s\u00fcrd\u00fcy\u00fc bir d\u00fcnyada \u00e7evik investisiya daimi ay\u0131ql\u0131q v\u0259 s\u00fcr\u0259tli q\u0259rar q\u0259bul etm\u0259yi t\u0259l\u0259b edir. 
\u018fsas tendensiyalar\u0131 ba\u015fa d\u00fc\u015f\u0259r\u0259k v\u0259 a\u011f\u0131ll\u0131 strategiyalar t\u0259tbiq ed\u0259r\u0259k, bazar d\u0259yi\u015fk\u0259nliyini g\u0259lirli f\u00fcrs\u0259tl\u0259r\u0259 \u00e7evirm\u0259k \u00fc\u00e7\u00fcn \u00f6z\u00fcn\u00fcz\u00fc m\u00f6vqel\u0259ndir\u0259 bil\u0259rsiniz. M\u0259lumatl\u0131 qal\u0131n, \u00e7evik olun v\u0259 u\u011furlu investisiyalar!\n    </content_az>\n  </article>\n</response_article>\nI hope you liked it!\n\"\"\"\n\n# Yeah, big text... It is LLM response to our enhanced prompt. We are going to use this text to extract some information about the product.\nprint(data.article.title_az)\n# \"\u00c7evik \u0130nvestisiya: Haz\u0131rk\u0131 Texnologiya Art\u0131mlar\u0131ndan v\u0259 Qlobal Valyuta Dinamikalar\u0131ndan Faydalanma\"\n# \n``` \n\n## \ud83d\udcda Comprehensive Examples\n\n### 1. Basic Text Extraction\n\n#### JSON Extraction with Fallback Strategies\n\n```python\nfrom dataclasses import dataclass\nfrom typing import List, Optional\nfrom textlasso import extract\n\n@dataclass\nclass Product:\n    name: str\n    price: float\n    category: str\n    in_stock: bool\n    tags: Optional[List[str]] = None\n\n# Works with clean JSON\nclean_json = '{\"name\": \"Laptop\", \"price\": 999.99, \"category\": \"Electronics\", \"in_stock\": true}'\n\n# Works with markdown-wrapped JSON\nmarkdown_json = \"\"\"\nHere's your product data:\n```json\n{\n    \"name\": \"Wireless Headphones\",\n    \"price\": 199.99,\n    \"category\": \"Electronics\", \n    \"in_stock\": false,\n    \"tags\": [\"wireless\", \"bluetooth\", \"noise-canceling\"]\n}\n\\```\n\"\"\"\n\n# Works with messy responses\nmessy_response = \"\"\"\nLet me extract that product information for you...\n\nThe product details are: {\"name\": \"Smart Watch\", \"price\": 299.99, \"category\": \"Wearables\", \"in_stock\": true}\n\nIs this what you were looking for?\n\"\"\"\n\n# All of these work automatically\nproducts = [\n    
extract(clean_json, Product, extract_strategy='json'),\n    extract(markdown_json, Product, extract_strategy='json'), \n    extract(messy_response, Product, extract_strategy='json')\n]\n\nfor product in products:\n    print(f\"{product.name}: ${product.price} ({'\u2705' if product.in_stock else '\u274c'})\")\n```\n\n#### XML Extraction\n\n```python\nfrom dataclasses import dataclass\nfrom typing import List, Optional\nfrom textlasso import extract\n\n@dataclass \nclass Address:\n    street: str\n    city: str\n    country: str\n    zip_code: Optional[str] = None\n    \n@dataclass\nclass ResponseAddress:\n    address: Address\n\nxml_data = \"\"\"\n<address>\n    <street>123 Main St</street>\n    <city>San Francisco</city>\n    <country>USA</country>\n    <zip_code>94102</zip_code>\n</address>\n\"\"\"\n\nresponse_address = extract(xml_data, ResponseAddress, extract_strategy='xml')\nprint(f\"Address: {response_address.address.street}, {response_address.address.city}, {response_address.address.country}\")\n# Address: 123 Main St, San Francisco, USA\n```\n\n### 2. 
Complex Nested Data Structures\n\n```python\nfrom dataclasses import dataclass\nfrom typing import List, Optional\nfrom enum import Enum\n\nclass Department(Enum):\n    ENGINEERING = \"engineering\"\n    MARKETING = \"marketing\" \n    SALES = \"sales\"\n    HR = \"hr\"\n\n@dataclass\nclass Employee:\n    id: int\n    name: str\n    department: Department\n    salary: float\n    skills: List[str]\n    manager_id: Optional[int] = None\n\n@dataclass\nclass Company:\n    name: str\n    founded_year: int\n    employees: List[Employee]\n    headquarters: Address\n\ncomplex_json = \"\"\"\n{\n    \"name\": \"TechCorp Inc\",\n    \"founded_year\": 2015,\n    \"headquarters\": {\n        \"street\": \"100 Tech Plaza\",\n        \"city\": \"Austin\", \n        \"country\": \"USA\",\n        \"zip_code\": \"78701\"\n    },\n    \"employees\": [\n        {\n            \"id\": 1,\n            \"name\": \"Sarah Chen\", \n            \"department\": \"engineering\",\n            \"salary\": 120000,\n            \"skills\": [\"Python\", \"React\", \"AWS\"],\n            \"manager_id\": null\n        },\n        {\n            \"id\": 2,\n            \"name\": \"Mike Rodriguez\",\n            \"department\": \"marketing\", \n            \"salary\": 85000,\n            \"skills\": [\"SEO\", \"Content Strategy\", \"Analytics\"],\n            \"manager_id\": 1\n        }\n    ]\n}\n\"\"\"\n\ncompany = extract(complex_json, Company, extract_strategy='json')\nprint(f\"Company: {company.name} ({company.founded_year})\")\nprint(f\"HQ: {company.headquarters.city}, {company.headquarters.country}\")\nprint(f\"Employees: {len(company.employees)}\")\n\nfor emp in company.employees:\n    print(f\"  - {emp.name} ({emp.department.value}): {', '.join(emp.skills)}\")\n\n# HQ: Austin, USA\n# Employees: 2\n#   - Sarah Chen (engineering): Python, React, AWS\n#   - Mike Rodriguez (marketing): SEO, Content Strategy, Analytics\n```\n\n### 3. 
LLM Response Cleaning\n\n```python\nfrom textlasso.cleaners import clear_llm_res\n\n# Clean various LLM response formats\nmessy_responses = [\n    \"\\```json\\\\n{\\\"key\\\": \\\"value\\\"}\\\\n\\```\",\n    \"\\```\\\\n{\\\"key\\\": \\\"value\\\"}\\\\n\\```\", \n    \"Here's the data: {\\\"key\\\": \\\"value\\\"} hope it helps!\",\n    \"\\```xml\\\\n<root><item>data</item></root>\\\\n\\```\"\n]\n\nfor response in messy_responses:\n    clean_json = clear_llm_res(response, extract_strategy='json')\n    clean_xml = clear_llm_res(response, extract_strategy='xml')\n    print(f\"Original: {response}\")\n    print(f\"JSON cleaned: {clean_json}\")\n    print(f\"XML cleaned: {clean_xml}\")\n    print(\"---\")\n```\n\n### 4. Advanced Data Extraction with Configuration\n\n```python\nimport logging\nfrom dataclasses import dataclass\nfrom typing import Optional\n\nfrom textlasso import extract_from_dict\n\n# Configure custom logging\nlogger = logging.getLogger(\"my_extractor\")\nlogger.setLevel(logging.DEBUG)\n\n@dataclass\nclass FlexibleData:\n    required_field: str\n    optional_field: Optional[str] = None\n    number_field: int = 0\n\n# Input with a type mismatch and an extra field\ndata_with_extra = {\n    \"required_field\": \"test\",\n    \"optional_field\": \"optional\", \n    \"number_field\": \"123\",  # String instead of int\n    \"extra_field\": \"ignored\"  # Extra field\n}\n\n# Strict mode (default) - raises errors on type mismatches\ntry:\n    result_strict = extract_from_dict(\n        data_with_extra, \n        FlexibleData,\n        strict_mode=True,\n        ignore_extra_fields=True,\n        logger=logger\n    )\n    print(\"Strict mode result:\", result_strict)\nexcept Exception as e:\n    print(\"Strict mode error:\", e)\n\n# Flexible mode - attempts conversion\nresult_flexible = extract_from_dict(\n    data_with_extra,\n    FlexibleData, \n    strict_mode=False,\n    ignore_extra_fields=True,\n    logger=logger\n)\nprint(\"Flexible mode result:\", result_flexible)\n```\n\n### 5. 
Structured Prompt Generation\n\n#### Basic Prompt Generation\n\nPrompt generator (more precisely, builder) functions return a `Prompt` object, which holds the expected data structure, the prompt text, and a shortcut for data extraction:\n- `prompt.prompt` - the enhanced prompt;\n- `prompt.prompt_original` - the original prompt;\n- `prompt.schema` - the dataclass describing the expected structure;\n- `prompt.strategy` - the extraction strategy (json/xml);\n- `prompt.extract(\"<text to extract>\")` - extracts data from the given text;\n- `prompt.data` - initially `None`; `.extract()` overwrites it;\n- `prompt.has_data()` - returns True if `prompt.prompt` contains data;\n\n\n**Note:** The `prompt.extract()` method is a shortcut for the `textlasso.extract()` function.\n\n\n```python\nfrom dataclasses import dataclass\nfrom typing import List, Optional\n\nfrom textlasso import generate_structured_prompt\n\n@dataclass\nclass UserFeedback:\n    rating: int\n    comment: str\n    category: str\n    recommended: bool\n    issues: Optional[List[str]] = None\n\n# Generate a structured prompt\nprompt = generate_structured_prompt(\n    prompt=\"Analyze this customer review and extract structured feedback\",\n    schema=UserFeedback,\n    strategy=\"json\",\n    include_schema_description=True,\n    example_count=2\n)\n\nprint(prompt.prompt)\n# Output:\n# Analyze this customer review and extract structured feedback\n\n# ## OUTPUT FORMAT REQUIREMENTS\n\n# You must respond with a valid JSON object that follows this exact structure:\n\n# ### Schema: UserFeedback\n# - **rating**: int (required)\n# - **comment**: str (required)\n# - **category**: str (required)\n# - **recommended**: bool (required)\n# - **issues**: Array of str (optional)\n\n\n# ### JSON Format Rules:\n# - Use proper JSON syntax with double quotes for strings\n# - Include all required fields\n# - Use null for optional fields that are not provided\n# - Arrays should contain objects matching the specified structure\n# - Numbers should not be quoted\n# - Booleans should be true/false (not quoted)\n\n\n# ## EXAMPLES\n\n# Here are 2 examples of 
the expected JSON format:\n\n# ### Example 1:\n# ```json\n# {\n#   \"rating\": 1,\n#   \"comment\": \"example_comment_1\",\n#   \"category\": \"example_category_1\",\n#   \"recommended\": true,\n#   \"issues\": [\n#     \"example_issues_item_1\",\n#     \"example_issues_item_2\"\n#   ]\n# }\n# ```\n\n# ### Example 2:\n# ```json\n# {\n#   \"rating\": 2,\n#   \"comment\": \"example_comment_2\",\n#   \"category\": \"example_category_2\",\n#   \"recommended\": false,\n#   \"issues\": [\n#     \"example_issues_item_1\",\n#     \"example_issues_item_2\",\n#     \"example_issues_item_3\"\n#   ]\n# }\n# ```\n\n# Remember: Your response must be valid JSON that matches the specified structure exactly.\n```\n\n#### Using the Decorator for Function Enhancement\n\nIf you have functions that return prompts, you can use the `@structured_output` decorator to automatically add structure requirements to those prompts.\n\n```python\nfrom dataclasses import dataclass\nfrom typing import Optional, List\n\nfrom textlasso import structured_output\n\n@dataclass\nclass NewsArticle:\n    title: str\n    summary: str\n    category: str\n    sentiment: str\n    key_points: List[str]\n    publication_date: Optional[str] = None\n\n# Decorate a prompt-returning function\n@structured_output(schema=NewsArticle, strategy=\"xml\", example_count=1)\ndef create_article_analysis_prompt(article_text: str) -> str:\n    return f\"\"\"\n    Analyze the following news article and extract key information:\n    \n    Article: {article_text}\n    \n    Please provide a comprehensive analysis focusing on the main themes,\n    sentiment, and key takeaways.\n    \"\"\"\n\n# The decorator automatically enhances your prompt with structure requirements\narticle_text = \"Breaking: New AI breakthrough announced by researchers...\"\nenhanced_prompt = create_article_analysis_prompt(article_text)\n\n# This prompt now includes schema definitions, examples, and format requirements\nprint(\"Enhanced prompt: \", 
enhanced_prompt.prompt)\n\n# Enhanced prompt:  \n#     Analyze the following news article and extract key information:\n    \n#     Article: Breaking: New AI breakthrough announced by researchers...\n    \n#     Please provide a comprehensive analysis focusing on the main themes,\n#     sentiment, and key takeaways.\n    \n\n\n# ## OUTPUT FORMAT REQUIREMENTS\n\n# You must respond with a valid XML object that follows this exact structure:\n\n# ### Schema: NewsArticle\n# - **title**: str (required)\n# - **summary**: str (required)\n# - **category**: str (required)\n# - **sentiment**: str (required)\n# - **key_points**: Array of str (required)\n# - **publication_date**: str (optional)\n\n\n# ### XML Format Rules:\n# - Use proper XML syntax with opening and closing tags\n# - Root element should match the main dataclass name\n# - Use snake_case for element names\n# - For arrays, repeat the element name for each item\n# - Use self-closing tags for null/empty optional fields\n# - Include all required fields as elements\n```\n\n### 6. 
Real-World Use Cases\n\n#### Processing Survey Responses\n\n```python\n@dataclass\nclass SurveyResponse:\n    respondent_id: str\n    age_group: str\n    satisfaction_rating: int\n    feedback: str\n    would_recommend: bool\n    improvement_areas: List[str]\n\n# Simulating LLM processing of survey data\nllm_survey_output = \"\"\"\nBased on the survey response, here's the extracted data:\n\n\\```json\n{\n    \"respondent_id\": \"RESP_001\",\n    \"age_group\": \"25-34\", \n    \"satisfaction_rating\": 4,\n    \"feedback\": \"Great service overall, but could improve response time\",\n    \"would_recommend\": true,\n    \"improvement_areas\": [\"response_time\", \"pricing\"]\n}\n\\```\n\nThis response indicates positive sentiment with specific improvement suggestions.\n\"\"\"\n\nsurvey = extract(llm_survey_output, SurveyResponse, extract_strategy='json')\nprint(survey)\n# SurveyResponse(respondent_id='RESP_001', age_group='25-34', satisfaction_rating=4, feedback='Great service overall, but could improve response time', would_recommend=True, improvement_areas=['response_time', 'pricing'])\n```\n\n#### E-commerce Product Extraction\n\n```python\n@dataclass\nclass ProductReview:\n    product_id: str\n    reviewer_name: str\n    rating: int\n    review_text: str\n    verified_purchase: bool\n    helpful_votes: int\n    review_date: str\n\n@structured_output(schema=ProductReview, strategy=\"xml\")\ndef create_review_extraction_prompt(raw_review: str) -> str:\n    return f\"\"\"\n        Extract structured information from this product review:\n        \n        {raw_review}\n        \n        Pay attention to implicit ratings, sentiment, and any verification indicators. \n        \"\"\"\n\nraw_review = \"\"\"\n    \u2605\u2605\u2605\u2605\u2606 Amazing headphones! by John D. (Verified Purchase) - March 15, 2024\n    These headphones exceeded my expectations. Great sound quality and comfortable fit.\n    Battery life could be better but overall very satisfied. 
Would definitely buy again!\n    \ud83d\udc4d 47 people found this helpful\n    \"\"\"\n\nextraction_prompt = create_review_extraction_prompt(raw_review)\n# Send this prompt to your LLM, then extract the response:\n# llm = some LLM model\nllm_response = llm.invoke(extraction_prompt.prompt)\nreview = extract(llm_response, ProductReview, extract_strategy='xml')\n# or\nextraction_prompt.extract(llm_response)\n```\n\n## \ud83d\udd27 Configuration Options\n\n### Extraction Configuration\n\n```python\nfrom textlasso import extract_from_dict\nimport logging\n\n# Configure extraction behavior\nresult = extract_from_dict(\n    data_dict=your_data,\n    target_class=YourDataClass,\n    strict_mode=False,          # Allow type conversions\n    ignore_extra_fields=True,   # Ignore unknown fields\n    logger=custom_logger,       # Custom logging\n    log_level=logging.DEBUG     # Detailed logging\n)\n```\n\n### Prompt Generation Configuration\n\n```python\nfrom textlasso import generate_structured_prompt\n\nprompt = generate_structured_prompt(\n    prompt=\"Your base prompt\",\n    schema=YourSchema,\n    strategy=\"json\",                    # or \"xml\"\n    include_schema_description=True,    # Include field descriptions\n    example_count=3                     # Number of examples (1-3)\n)\n```\n\n## \ud83d\udcd6 API Reference\n\n### Core Functions\n\n#### `extract(text, target_class, extract_strategy='json')`\nExtract structured data from text.\n\n**Parameters:**\n- `text` (str): Raw text containing data to extract\n- `target_class` (type): Dataclass to convert data into\n- `extract_strategy` (Literal['json', 'xml']): Extraction strategy\n\n**Returns:** Instance of `target_class`\n\n#### `extract_from_dict(data_dict, target_class, **options)`\nConvert dictionary to dataclass with advanced options.\n\n#### `generate_structured_prompt(prompt, schema, strategy, **options)`\nGenerate enhanced prompts with structure requirements.\n\n### Decorators\n\n#### 
`@structured_output(schema, strategy='json', **options)`\nEnhance prompt functions with structured output requirements.\n\n#### `@chain_prompts(*prompt_funcs, separator='\\n\\n---\\n\\n')`\nChain multiple prompt functions together.\n\n#### `@prompt_cache(maxsize=128)`\nCache prompt results for better performance.\n\n### Utilities\n\n#### `clear_llm_res(text, extract_strategy)`\nClean LLM responses by removing code blocks and formatting.\n\n## \ud83e\udd1d Contributing\n\nWe welcome contributions! Here's how to get started:\n\n1. Fork the repository\n2. Create a feature branch: `git checkout -b feature-name`\n3. Make your changes and add tests\n4. Run tests: `pytest`\n5. Submit a pull request\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- Built for the AI/LLM community\n- Inspired by the need for robust text processing in AI applications\n- Special thanks to all contributors and users\n\n## \ud83d\udcde Support\n\n- \ud83d\udce7 Email: aziznadirov@yahoo.com\n- \ud83d\udc1b Issues: [GitHub Issues](https://github.com/AzizNadirov/textlasso/issues)\n\n---\n\n**TextLasso** - Wrangle your text data with ease! \ud83e\udd20\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Simple packege for working with LLM text responses and prompts.",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://github.com/AzizNadirov/textlasso",
        "Repository": "https://github.com/AzizNadirov/textlasso.git"
    },
    "split_keywords": [
        "llm",
        " text",
        " crawl",
        " extract",
        " text-cleaning"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a5899aa5355de2d2a466c31b3e23fdef79143519815f990d587bdebb645fd07d",
                "md5": "257559b80193f334484738fe479ee6c7",
                "sha256": "df0e6610a2679d619705a87b5def82547a46c62b4ae1fb29cc046d4f7e0d5ef7"
            },
            "downloads": -1,
            "filename": "textlasso-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "257559b80193f334484738fe479ee6c7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 29667,
            "upload_time": "2025-07-31T13:45:57",
            "upload_time_iso_8601": "2025-07-31T13:45:57.518683Z",
            "url": "https://files.pythonhosted.org/packages/a5/89/9aa5355de2d2a466c31b3e23fdef79143519815f990d587bdebb645fd07d/textlasso-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "21b7d626b649f12a96ed36a056bb898563b577c520e5dc88f131fa87c7760577",
                "md5": "b69639334ad205f6d780562f3b3723f6",
                "sha256": "1f64dd834a58d2b6e6f8b0f5c969fcdd50a92b1c2f619bfa9b9bc81ea17d468f"
            },
            "downloads": -1,
            "filename": "textlasso-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "b69639334ad205f6d780562f3b3723f6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 52125,
            "upload_time": "2025-07-31T13:45:59",
            "upload_time_iso_8601": "2025-07-31T13:45:59.332447Z",
            "url": "https://files.pythonhosted.org/packages/21/b7/d626b649f12a96ed36a056bb898563b577c520e5dc88f131fa87c7760577/textlasso-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-31 13:45:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "AzizNadirov",
    "github_project": "textlasso",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "textlasso"
}
        