Document Processing
- Docling / Unstructured / PyMuPDF / Llamaparse / Azure Document Intelligence
Chunking + Metadata
- LangChain / LlamaIndex / Chonkie / Docling's chunkers
- GLiNER for metadata extraction
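The chunkers above handle this in practice; the core idea is just splitting text into overlapping windows and attaching metadata to each piece. A dependency-free sketch (the chunk size, overlap, and `source` field are illustrative assumptions, not any library's defaults):

```python
def chunk_text(text, source, chunk_size=200, overlap=50):
    """Split text into overlapping chunks, attaching simple metadata."""
    chunks = []
    step = chunk_size - overlap
    for i, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + chunk_size]
        if not piece:
            break
        chunks.append({
            "text": piece,
            "metadata": {"source": source, "chunk_index": i},
        })
    return chunks

chunks = chunk_text("a" * 500, source="report.pdf")
```

The overlap keeps sentences that straddle a boundary visible in two chunks, which helps retrieval; library chunkers add smarter boundaries (sentences, headings, tokens) on top of this same pattern.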
Embeddings
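Embeddings come from a model (e.g. via sentence-transformers or a hosted API); retrieval then compares vectors, most commonly by cosine similarity. A dependency-free sketch of that comparison — the 3-d vectors here are made up for illustration, real models produce hundreds of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings" -- illustrative values only.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {"doc_a": [1.0, 0.0, 1.0], "doc_b": [0.0, 1.0, 0.0]}

# Retrieve the document whose vector is closest to the query.
best = max(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]))
```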
import os
import streamlit as st
from crewai import Agent, Task, Crew, LLM
from crewai_tools import SerperDevTool
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Streamlit page config
st.set_page_config(layout="wide")

topic = "AI in Healthcare"

llm = LLM(
    model="gemini/gemini-2.5-flash",
)
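CrewAI fills `{topic}`-style placeholders in agent and task descriptions from the `inputs` dict passed to `kickoff()`. Conceptually this is plain string interpolation, sketched here with stdlib formatting (the description text is illustrative, not from the source):

```python
# Task descriptions use {placeholders}, filled from kickoff inputs.
task_description = (
    "Research the latest developments on {topic} and summarize key findings."
)
inputs = {"topic": "AI in Healthcare"}

# CrewAI performs this substitution internally at kickoff time.
rendered = task_description.format(**inputs)
```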
You are in charge of client orders. Your job is to take incoming information about new orders and write a clear summary that will be emailed to the team. The email should be signed off by the "Customer Success Team".

Here is the information on client orders.

Order ID:
Customer Name:
Product:
Quantity:
Price:
Order Date:
Status:
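Before the agent sees it, the order fields above have to be filled in from the incoming order record. A sketch of that templating step — the field values and dict shape are invented for illustration:

```python
# Template mirroring the order fields in the prompt above.
ORDER_TEMPLATE = """Here is the information on client orders.
Order ID: {order_id}
Customer Name: {customer_name}
Product: {product}
Quantity: {quantity}
Price: {price}
Order Date: {order_date}
Status: {status}"""

# Hypothetical incoming order record.
order = {
    "order_id": "1001",
    "customer_name": "Acme Corp",
    "product": "Widget",
    "quantity": 3,
    "price": "$29.97",
    "order_date": "2024-05-01",
    "status": "Shipped",
}

prompt = ORDER_TEMPLATE.format(**order)
```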
import json
import boto3

ENDPOINT = "huggingface-pytorch-tgi-inference-"

sagemaker_runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

def lambda_handler(event, context):
    query_params = event["queryStringParameters"]
    query = query_params["query"]
    # Forward the query to the SageMaker endpoint as the model input
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT,
        ContentType="application/json",
        Body=json.dumps({"inputs": query}),
    )
    result = json.loads(response["Body"].read().decode("utf-8"))
    return {"statusCode": 200, "body": json.dumps(result)}
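The handler's input parsing can be exercised locally with a fake API Gateway event, without any AWS call. This sketch only shows the event shape the handler expects (`extract_query` is a hypothetical helper mirroring the handler's first two lines):

```python
def extract_query(event):
    """Mirror the handler's query extraction from an API Gateway event."""
    return event["queryStringParameters"]["query"]

# API Gateway delivers query-string parameters under this key.
fake_event = {"queryStringParameters": {"query": "What is RAG?"}}
query = extract_query(fake_event)
```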
import json
import boto3
import botocore.config
from datetime import datetime

### AWS BEDROCK CALL ###
# {
#   "modelId": "meta.llama4-scout-17b-instruct-v1:0",
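Bedrock's `invoke_model` takes a model ID plus a JSON body whose schema depends on the model family. A sketch of building a body for a Meta Llama model — the `prompt`, `max_gen_len`, and `temperature` keys follow the Llama request schema on Bedrock as I understand it, and the values are illustrative:

```python
import json

def build_llama_body(prompt, max_gen_len=512, temperature=0.5):
    """Build the JSON request body for a Meta Llama model on Bedrock."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
    })

body = build_llama_body("Summarize the latest AI healthcare news.")
```

This string is what you would pass as `body=` to `bedrock_runtime.invoke_model(...)` alongside the `modelId` shown above.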
content_writer:
  role: >
    Educational Content Writer
  goal: >
    Create engaging, informative content that thoroughly explains the assigned topic
    and provides valuable insights to the reader
  backstory: >
    You are a talented educational writer with expertise in creating clear, engaging
    content. You have a gift for explaining complex concepts in accessible language
    and organizing information in a way that helps readers build their understanding.
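In a CrewAI project this YAML is typically loaded for you (e.g. by the `@CrewBase` project decorator), and each top-level key becomes the keyword arguments for one agent. A dependency-free sketch of that mapping using a plain dict in place of the parsed YAML (text abbreviated):

```python
# Plain-dict stand-in for the parsed YAML config above (text abbreviated).
agents_config = {
    "content_writer": {
        "role": "Educational Content Writer",
        "goal": "Create engaging, informative content...",
        "backstory": "You are a talented educational writer...",
    }
}

# Each top-level key maps to one agent's keyword arguments,
# conceptually: Agent(**agents_config["content_writer"])
writer_kwargs = agents_config["content_writer"]
```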
# You can use the MLflow context manager to log any param or metric values; one example is shown below.
import mlflow

with mlflow.start_run():
    mlflow.log_param("param_name", param_value)
    mlflow.log_metric("metric_name", metric_value)