Knowledge mining extracts actionable insights from unstructured data such as documents and images. Azure AI Search supports this by combining full-text search with AI enrichment applied while content is indexed, which makes it a good fit for large enterprise content stores. Let’s walk through building a knowledge mining solution.
Core Components of Azure AI Search
- Data Source: Blob Storage, SQL Database, Cosmos DB, or JSON documents.
- Skillset: Pipeline of AI skills (e.g., OCR, entity recognition) to enrich data.
- Indexer: Crawls the data source, runs the skillset, and maps the results to an index.
- Index: Searchable JSON documents with enriched fields.
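As a rough sketch of how these four components refer to one another, the Python below builds the kind of JSON payloads that would be sent to the service's REST API. Every resource name (`feedback-blobs`, `feedback-index`, and so on), the connection string placeholder, and the `dangling_references` helper are assumptions made for illustration; nothing here calls Azure.

```python
import json

# All resource names and the connection string are hypothetical placeholders.
data_source = {
    "name": "feedback-blobs",
    "type": "azureblob",
    "credentials": {"connectionString": "<storage-connection-string>"},
    "container": {"name": "feedback"},
}

index = {
    "name": "feedback-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "language", "type": "Edm.String", "filterable": True},
    ],
}

skillset = {
    "name": "feedback-skillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "languageCode", "targetName": "languageCode"}],
        }
    ],
}

# The indexer is the only component that names the other three.
indexer = {
    "name": "feedback-indexer",
    "dataSourceName": data_source["name"],
    "targetIndexName": index["name"],
    "skillsetName": skillset["name"],
}


def dangling_references(indexer, data_source, index, skillset):
    """Return indexer properties that do not match the given resource names."""
    expected = {
        "dataSourceName": data_source["name"],
        "targetIndexName": index["name"],
        "skillsetName": skillset["name"],
    }
    return [key for key, name in expected.items() if indexer.get(key) != name]


print(json.dumps(indexer, indent=2))
```

Checking the references locally before creating the resources is only a convenience; the service itself would also reject an indexer that points at a missing data source, index, or skillset.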
[Figure: Enrichment pipeline diagram]
Example Pipeline
- Document Cracking: Extract text and metadata.
- Enrichment: Apply skills like language detection or OCR.
- Indexing: Map outputs to searchable fields.
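The three stages above can be sketched as the parts of an indexer and skillset definition that control them. This is a minimal illustration, assuming a blob data source with image extraction enabled; the index field names `language` and `imageText` and the two helper functions are hypothetical.

```python
# Stage 1, document cracking: indexer parameters decide what gets extracted.
cracking_parameters = {
    "configuration": {
        "dataToExtract": "contentAndMetadata",
        "imageAction": "generateNormalizedImages",
    }
}

# Stage 2, enrichment: each skill reads from and writes to the enrichment tree.
skills = [
    {
        "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
        "context": "/document",
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "languageCode", "targetName": "languageCode"}],
    },
    {
        "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
        "context": "/document/normalized_images/*",
        "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
        "outputs": [{"name": "text", "targetName": "text"}],
    },
]

# Stage 3, indexing: output field mappings copy enriched nodes into index fields.
# "language" and "imageText" are assumed index field names.
output_field_mappings = [
    {"sourceFieldName": "/document/languageCode", "targetFieldName": "language"},
    {
        "sourceFieldName": "/document/normalized_images/*/text",
        "targetFieldName": "imageText",
    },
]


def enriched_paths(skills):
    """List the enrichment-tree paths the skills write: context + '/' + targetName."""
    paths = set()
    for skill in skills:
        context = skill.get("context", "/document")
        for output in skill["outputs"]:
            paths.add(f"{context}/{output['targetName']}")
    return paths


def unresolved_mappings(skills, mappings):
    """Return mappings whose source path is not produced by any skill."""
    produced = enriched_paths(skills)
    return [m for m in mappings if m["sourceFieldName"] not in produced]


print(sorted(enriched_paths(skills)))
```

A mapping that points at a path no skill produces is an easy mistake to make, which is why the sketch checks for it before anything is deployed.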
Creating Custom Skills
When built-in skills aren’t enough, create custom skills as Web APIs (e.g., Azure Functions):
```json
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "uri": "https://myfunction.azurewebsites.net/api/skill",
  "inputs": [
    { "name": "text", "source": "/document/content" }
  ],
  "outputs": [
    { "name": "result", "targetName": "customField" }
  ]
}
```
Use Case: Extract custom entities (e.g., product codes) from invoices.
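To make the use case concrete, here is a minimal sketch of the logic such a Web API might run. A custom skill receives a JSON body with a `values` array of records and returns one result per `recordId`. The product code format (three uppercase letters, a hyphen, four digits) and the `run_skill` function name are assumptions for illustration; the HTTP hosting layer, such as an Azure Function, is left out.

```python
import re

# Hypothetical product code format, e.g. "ABC-1234". Adjust to the real scheme.
PRODUCT_CODE = re.compile(r"\b[A-Z]{3}-\d{4}\b")


def run_skill(request_body):
    """Process a custom skill request and build the matching response body.

    Input records carry the skill's "text" input; each output record carries
    the "result" output named in the skill definition.
    """
    results = []
    for record in request_body.get("values", []):
        text = record.get("data", {}).get("text")
        item = {
            "recordId": record["recordId"],
            "data": {},
            "errors": [],
            "warnings": [],
        }
        if isinstance(text, str):
            item["data"]["result"] = PRODUCT_CODE.findall(text)
        else:
            # Report the problem per record instead of failing the whole batch.
            item["errors"].append(
                {"message": "Input 'text' is missing or not a string."}
            )
        results.append(item)
    return {"values": results}


if __name__ == "__main__":
    sample = {
        "values": [
            {"recordId": "1", "data": {"text": "Invoice lists ABC-1234 and XYZ-0042."}},
            {"recordId": "2", "data": {}},
        ]
    }
    print(run_skill(sample))
```

Each response record has to echo the `recordId` it was given so the indexer can match results to documents, which is why the sketch reports errors per record.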
Creating a Knowledge Store
Persist enrichments in Azure Storage as:
- Tables: Rows in Azure Table Storage, suited to analytics tools.
- Objects: JSON documents in Blob Storage.
- Files: Images extracted from documents, saved to Blob Storage.
Use the Shaper skill to structure projections:
```json
{
  "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
  "inputs": [
    { "name": "url", "source": "/document/url" }
  ],
  "outputs": [
    { "name": "output", "targetName": "projection" }
  ]
}
```
Example: Store key phrases from customer feedback for trend analysis.
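The sketch below shows roughly what the `knowledgeStore` section of a skillset could look like when it projects the Shaper output (`/document/projection`) into all three storage types. The table name, container names, and the `build_knowledge_store` and `projection_counts` helpers are hypothetical, and the connection string is a placeholder.

```python
def build_knowledge_store(connection_string):
    """Build a knowledgeStore definition with one projection group per type.

    Table and container names are illustrative placeholders.
    """
    return {
        "storageConnectionString": connection_string,
        "projections": [
            {
                # Table projection: one row per shaped document.
                "tables": [
                    {
                        "tableName": "feedbackKeyPhrases",
                        "generatedKeyName": "keyPhraseId",
                        "source": "/document/projection",
                    }
                ],
                "objects": [],
                "files": [],
            },
            {
                # Object projection: the same shape stored as a JSON blob.
                "tables": [],
                "objects": [
                    {
                        "storageContainer": "feedback-json",
                        "source": "/document/projection",
                    }
                ],
                "files": [],
            },
            {
                # File projection: normalized images extracted during cracking.
                "tables": [],
                "objects": [],
                "files": [
                    {
                        "storageContainer": "feedback-images",
                        "source": "/document/normalized_images/*",
                    }
                ],
            },
        ],
    }


def projection_counts(knowledge_store):
    """Count how many projections of each type the definition declares."""
    counts = {"tables": 0, "objects": 0, "files": 0}
    for group in knowledge_store["projections"]:
        for kind in counts:
            counts[kind] += len(group.get(kind, []))
    return counts


store = build_knowledge_store("<storage-connection-string>")
print(projection_counts(store))
```

Keeping each projection type in its own group is one possible layout; projections that need to stay related to one another would instead share a group.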
Azure AI Search turns raw content into searchable, enriched data, and the knowledge store makes those same enrichments available for business intelligence.