# System Architecture

> **Meeting Intelligence Agent - Technical Architecture Documentation**

This document provides a high-level overview of the system architecture, design decisions, and component relationships for the Meeting Intelligence Agent.

---

## πŸ“‹ Table of Contents

- [System Overview](#-system-overview)
- [Architecture Diagram](#-architecture-diagram)
- [Core Components](#-core-components)
- [Data Flow](#-data-flow)
- [Key Design Decisions](#-key-design-decisions)
- [State Management](#-state-management)
- [Scalability & Performance](#-scalability--performance)
- [Security Architecture](#-security-architecture)
- [Technology Stack](#-technology-stack)

---

## 🎯 System Overview

The Meeting Intelligence Agent is a **conversational AI system** built on LangGraph that orchestrates meeting video processing, transcription, storage, and intelligent querying through natural language interaction.

### Core Capabilities

1. **Video Processing Pipeline**: Upload β†’ Transcription β†’ Speaker Diarization β†’ Metadata Extraction β†’ Vector Storage
2. **Semantic Search**: RAG-based querying across meeting transcripts using natural language
3. **External Integrations**: MCP (Model Context Protocol) servers for Notion and time-aware queries
4. **Conversational Interface**: Gradio-based chat UI with file upload support

### Design Philosophy

- **Conversational-First**: All functionality accessible through natural language
- **Modular Architecture**: Clear separation between UI, agent, tools, and services
- **Extensible**: MCP protocol enables easy addition of new capabilities
- **Async-Ready**: Supports long-running operations (transcription, MCP calls)
- **Production-Ready**: Docker support, error handling, graceful degradation

---

## πŸ—οΈ Architecture Diagram

```mermaid
graph TB
    subgraph "Frontend Layer"
        UI[Gradio Interface]
        Chat[Chat Component]
        Upload[File Upload]
        Editor[Transcript Editor]
    end
    
    subgraph "Agent Layer (LangGraph)"
        Agent[Conversational Agent]
        StateMachine[State Machine]
        ToolRouter[Tool Router]
    end
    
    subgraph "Tool Layer"
        VideoTools[Video Processing Tools]
        QueryTools[Meeting Query Tools]
        MCPTools[MCP Integration Tools]
    end
    
    subgraph "Processing Layer"
        WhisperX[WhisperX Transcription]
        Pyannote[Speaker Diarization]
        MetadataExtractor[GPT-4o-mini Metadata]
        Embeddings[OpenAI Embeddings]
    end
    
    subgraph "Storage Layer"
        Pinecone[(Pinecone Vector DB)]
        LocalState[Local State Cache]
    end
    
    subgraph "External Services"
        OpenAI[OpenAI API]
        NotionMCP[Notion MCP Server]
        TimeMCP[Time MCP Server]
        ZoomMCP[Zoom MCP Server<br/>In Development]
    end
    
    UI --> Agent
    Agent --> StateMachine
    StateMachine --> ToolRouter
    ToolRouter --> VideoTools
    ToolRouter --> QueryTools
    ToolRouter --> MCPTools
    
    VideoTools --> WhisperX
    VideoTools --> Pyannote
    VideoTools --> MetadataExtractor
    VideoTools --> Embeddings
    
    QueryTools --> Embeddings
    QueryTools --> Pinecone
    
    MCPTools --> NotionMCP
    MCPTools --> TimeMCP
    MCPTools -.-> ZoomMCP
    
    Embeddings --> Pinecone
    MetadataExtractor --> OpenAI
    Agent --> OpenAI
    
    classDef frontend fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000000
    classDef agent fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#000000   
    classDef tools fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#000000
    classDef processing fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000000
    classDef storage fill:#fff9c4,stroke:#fbc02d,stroke-width:2px,color:#000000
    classDef external fill:#ffebee,stroke:#c62828,stroke-width:2px,color:#000000
    
    class UI,Chat,Upload,Editor frontend
    class Agent,StateMachine,ToolRouter agent
    class VideoTools,QueryTools,MCPTools tools
    class WhisperX,Pyannote,MetadataExtractor,Embeddings processing
    class Pinecone,LocalState storage
    class OpenAI,NotionMCP,TimeMCP,ZoomMCP external
```

---

## 🧩 Core Components

### 1. Frontend Layer (Gradio)

**Purpose**: User interface for interaction and file management

**Components**:
- **Chat Interface**: Primary conversational UI using `gr.ChatInterface`
- **File Upload**: Video file upload widget
- **Transcript Editor**: Editable text area for manual corrections
- **State Display**: Real-time feedback on processing status

**Technology**: Gradio 5.x with async support

**Key Files**:
- `src/ui/gradio_app.py` - UI component definitions and event handlers

---

### 2. Agent Layer (LangGraph)

**Purpose**: Orchestrates the entire workflow through conversational AI

**Architecture**: State machine with three nodes:

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”
β”‚ PREPARE β”‚ --> β”‚ AGENT β”‚ --> β”‚ TOOLS β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”˜
                    ↑             β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

**Components**:

1. **Prepare Node**: Converts chat history to LangChain messages
2. **Agent Node**: LLM decides which tools to call
3. **Tools Node**: Executes selected tools
4. **Conditional Router**: Determines if more tool calls are needed

**State Structure**:
```python
{
    "message": str,              # Current user query
    "history": List[List[str]],  # Conversation history
    "llm_messages": List[Message], # LangChain message format
    "response": str,             # Generated response
    "error": Optional[str]       # Error tracking
}
```
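
The conditional router can be sketched as a plain function over the state's message list. Inspecting `tool_calls` on the last AI message is an assumption about the actual implementation in `src/agents/conversational.py`, but it is the conventional LangGraph pattern:

```python
# Hypothetical sketch of the conditional router: after the agent node runs,
# continue to the tools node only if the LLM requested tool calls.
def route_after_agent(state: dict) -> str:
    """Return the next node name: 'tools' if the LLM asked for tools, else 'end'."""
    last_message = state["llm_messages"][-1]
    # LangChain AI messages carry a `tool_calls` list; empty means we are done.
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return "end"
```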

**Key Files**:
- `src/agents/conversational.py` - LangGraph agent implementation (570 lines)

---

### 3. Tool Layer

**Purpose**: Provides discrete capabilities that the agent can invoke

**Categories**:

#### Video Processing Tools (8 tools)
- File upload management
- Transcription orchestration
- Speaker name mapping
- Transcript editing
- Pinecone upload

#### Meeting Query Tools (6 tools)
- Semantic search
- Metadata retrieval
- Meeting listing
- Text upsert
- Notion import/export

#### MCP Integration Tools (6+ tools)
- Notion API operations
- Time queries
- Future: Zoom RTMS

**Design Pattern**: LangChain `@tool` decorator for automatic schema generation
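
Under this pattern a tool is just a typed, docstringed function; the decorator derives the JSON schema the LLM sees from the signature and docstring. A minimal stand-in for that schema generation using stdlib introspection (the real `@tool` decorator does considerably more, and `list_meetings` here is illustrative, not the project's actual tool signature):

```python
import inspect

def describe_tool(fn) -> dict:
    """Build a minimal tool schema from a function's signature and docstring,
    mimicking what LangChain's @tool decorator derives automatically."""
    params = {}
    for name, p in inspect.signature(fn).parameters.items():
        params[name] = p.annotation.__name__ if p.annotation is not inspect.Parameter.empty else "any"
    return {"name": fn.__name__, "description": (fn.__doc__ or "").strip(), "parameters": params}

def list_meetings(limit: int = 10) -> str:
    """List stored meetings, most recent first."""
    return f"(would list up to {limit} meetings)"

schema = describe_tool(list_meetings)
# schema["name"] == "list_meetings"; schema["parameters"] == {"limit": "int"}
```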

**Key Files**:
- `src/tools/video.py` - Video processing tools (528 lines)
- `src/tools/general.py` - Query and integration tools (577 lines)
- `src/tools/mcp/` - MCP client wrappers

---

### 4. Processing Layer

**Purpose**: Handles compute-intensive operations

**Components**:

#### WhisperX Transcription
- **Model**: Configurable (tiny/small/medium/large)
- **Features**: Word-level timestamps, language detection
- **Performance**: GPU-accelerated when available

#### Pyannote Speaker Diarization
- **Model**: `pyannote/speaker-diarization-3.1`
- **Output**: Speaker segments with timestamps
- **Integration**: Aligned with WhisperX word timestamps

#### Metadata Extraction
- **Model**: GPT-4o-mini (cost-optimized)
- **Extracts**: Title, date, summary, speaker mapping
- **Format**: Structured JSON output

#### Embeddings
- **Model**: OpenAI `text-embedding-3-small`
- **Dimension**: 1536
- **Usage**: Query and document embedding

**Key Files**:
- `src/processing/transcription.py` - WhisperX + Pyannote pipeline
- `src/processing/metadata_extractor.py` - GPT-4o-mini extraction

---

### 5. Storage Layer

**Purpose**: Persistent and temporary data storage

#### Pinecone Vector Database
- **Type**: Serverless
- **Index**: `meeting-transcripts-1-dev`
- **Namespace**: Environment-based (`development`/`production`)
- **Metadata**: Rich metadata for filtering (title, date, source, speakers)

**Schema**:
```python
{
    "id": "meeting_abc12345_chunk_001",
    "values": [...],  # 1536-dim embedding vector
    "metadata": {
        "meeting_id": "meeting_abc12345",
        "meeting_title": "Q4 Planning",
        "meeting_date": "2024-12-07",
        "summary": "...",
        "speaker_mapping": {...},
        "source": "video",
        "chunk_index": 1,
        "text": "actual transcript chunk"
    }
}
```
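
A helper assembling records in this schema might look like the following sketch. The field names and id format mirror the example above; deriving the 8-character meeting id by hashing title and date is an assumption, not necessarily the project's actual scheme:

```python
import hashlib

def build_chunk_record(meeting_title: str, meeting_date: str, chunk_index: int,
                       text: str, embedding: list) -> dict:
    """Assemble one Pinecone record following the schema above."""
    # Derive a stable 8-char meeting id from title + date (assumed scheme).
    digest = hashlib.sha256(f"{meeting_title}|{meeting_date}".encode()).hexdigest()[:8]
    meeting_id = f"meeting_{digest}"
    return {
        "id": f"{meeting_id}_chunk_{chunk_index:03d}",
        "values": embedding,
        "metadata": {
            "meeting_id": meeting_id,
            "meeting_title": meeting_title,
            "meeting_date": meeting_date,
            "chunk_index": chunk_index,
            "text": text,
        },
    }
```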

#### Local State Cache
- **Purpose**: Temporary storage for video processing workflow
- **Scope**: In-memory, per-session
- **Contents**: Uploaded video path, transcription text, timing info

**Key Files**:
- `src/retrievers/pinecone.py` - Vector database manager

---

### 6. External Services

**Purpose**: Third-party APIs and custom MCP servers

#### OpenAI API
- **Models**: GPT-3.5-turbo (agent), GPT-4o-mini (metadata)
- **Usage**: Agent reasoning, metadata extraction, embeddings

#### Notion MCP Server
- **Type**: Official `@notionhq/notion-mcp-server`
- **Transport**: stdio (local subprocess)
- **Capabilities**: Search, read, create, update pages

#### Time MCP Server (Custom)
- **Type**: Gradio-based MCP server
- **Transport**: SSE (Server-Sent Events)
- **Deployment**: HuggingFace Spaces
- **URL**: `https://gfiamon-date-time-mpc-server-tool.hf.space/gradio_api/mcp/sse`
- **Purpose**: Time-aware query support

#### Zoom RTMS Server (In Development)
- **Type**: FastAPI + Gradio hybrid
- **Transport**: stdio + webhooks
- **Status**: Prototype, API integration pending
- **Purpose**: Live meeting transcription

**Key Files**:
- `src/tools/mcp/mcp_manager.py` - Multi-server MCP client
- `external_mcp_servers/time_mcp_server/` - Custom time server
- `external_mcp_servers/zoom_mcp/` - Zoom RTMS prototype

---

## πŸ”„ Data Flow

### Video Upload Flow

```
User uploads video.mp4
    ↓
Gradio saves to temp directory
    ↓
Agent calls transcribe_uploaded_video(path)
    ↓
WhisperX extracts audio + transcribes
    ↓
Pyannote identifies speakers
    ↓
Alignment: Match speakers to transcript
    ↓
Format: SPEAKER_00, SPEAKER_01, etc.
    ↓
Return formatted transcript to agent
    ↓
Agent shows transcript to user
    ↓
User optionally edits or updates speaker names
    ↓
Agent calls upload_transcription_to_pinecone()
    ↓
GPT-4o-mini extracts metadata
    ↓
Text chunked into semantic segments
    ↓
OpenAI embeddings generated
    ↓
Upsert to Pinecone with metadata
    ↓
Return meeting_id to user
```
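
The chunking step in the flow above can be sketched as a simple overlapping word-window splitter. The window and overlap sizes here are illustrative defaults, not the values used in the actual pipeline:

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list:
    """Split a transcript into overlapping word windows for embedding."""
    words = text.split()
    if len(words) <= max_words:
        return [text] if text else []
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        start += max_words - overlap  # slide forward, keeping `overlap` words of context
    return chunks
```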

### Query Flow

```
User asks: "What action items were assigned last Tuesday?"
    ↓
Agent receives query
    ↓
Agent calls get_time_for_city("Berlin") [Time MCP]
    ↓
Time server returns: "2024-12-07"
    ↓
Agent calculates: "Last Tuesday = 2024-12-03"
    ↓
Agent calls search_meetings(query="action items", date_filter="2024-12-03")
    ↓
Query embedded via OpenAI
    ↓
Pinecone vector search
    ↓
Top-k chunks retrieved with metadata
    ↓
Results returned to agent
    ↓
Agent synthesizes answer from chunks
    ↓
Response streamed to user
```
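
The date-filtered search step can be expressed as a metadata filter attached to the vector query. The `$eq` operator follows Pinecone's filter syntax; that `search_meetings` assembles its filter exactly this way is an assumption:

```python
def build_query_filter(date_filter=None, source=None):
    """Build a Pinecone metadata filter dict for a meeting search.

    Returns None when no constraints are given, so the query stays unfiltered.
    """
    clauses = {}
    if date_filter:
        clauses["meeting_date"] = {"$eq": date_filter}
    if source:
        clauses["source"] = {"$eq": source}
    return clauses or None

# Example: restrict the semantic search to last Tuesday's meetings.
# build_query_filter(date_filter="2024-12-03")
# -> {"meeting_date": {"$eq": "2024-12-03"}}
```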

### Notion Integration Flow

```
User: "Import 'Meeting 3' from Notion"
    ↓
Agent calls import_notion_to_pinecone(query="Meeting 3")
    ↓
Tool calls Notion MCP: API-post-search(query="Meeting 3")
    ↓
Notion returns page_id
    ↓
Tool calls API-retrieve-a-page(page_id) β†’ metadata
    ↓
Tool calls API-get-block-children(page_id) β†’ content blocks
    ↓
Recursive extraction of nested blocks
    ↓
Full text assembled
    ↓
GPT-4o-mini extracts metadata
    ↓
Text chunked and embedded
    ↓
Upsert to Pinecone
    ↓
Return success message with meeting_id
```

---

## 🎨 Key Design Decisions

### 1. Why LangGraph?

**Decision**: Use LangGraph instead of LangChain's AgentExecutor or other frameworks

**Rationale**:
- βœ… **Explicit state management**: Full control over conversation state
- βœ… **Async support**: Required for MCP tools (Notion API)
- βœ… **Debugging**: Clear visibility into state transitions
- βœ… **Flexibility**: Easy to add custom nodes and conditional routing
- βœ… **Streaming**: Native support for response streaming

**Alternative Considered**: LangChain AgentExecutor (rejected due to limited async support)

---

### 2. Why Separate MCP Servers?

**Decision**: Deploy custom MCP servers in `external_mcp_servers/` as standalone applications

**Rationale**:
- βœ… **Independent scaling**: Time server can handle multiple agents
- βœ… **Deployment flexibility**: Update servers without redeploying agent
- βœ… **Development isolation**: Test MCP servers independently
- βœ… **Reusability**: Other projects can use the same MCP servers
- βœ… **Transport options**: HTTP (SSE) for remote, stdio for local

**Architecture**:
```
Main Agent (HF Space 1)
    ↓ HTTP/SSE
Time MCP Server (HF Space 2)
    ↓ HTTP/SSE
Zoom MCP Server (HF Space 3)
```

**Alternative Considered**: Embed MCP servers in main app (rejected due to coupling)

---

### 3. Why Pinecone Serverless?

**Decision**: Use Pinecone serverless for vector storage

**Rationale**:
- βœ… **No infrastructure management**: Fully managed
- βœ… **Cost-effective**: Pay per usage, no idle costs
- βœ… **Scalability**: Auto-scales with demand
- βœ… **Metadata filtering**: Rich filtering capabilities
- βœ… **Namespaces**: Environment isolation (dev/prod)

**Alternative Considered**: Chroma (rejected due to self-hosting requirements)

---

### 4. Why GPT-3.5-turbo for Agent?

**Decision**: Use GPT-3.5-turbo instead of GPT-4 for agent reasoning

**Rationale**:
- βœ… **Cost**: 10x cheaper than GPT-4
- βœ… **Speed**: Faster response times
- βœ… **Sufficient**: Tool calling works well with 3.5-turbo
- βœ… **Budget**: GPT-4o-mini used for metadata extraction (specialized task)

**Cost Comparison** (per 1M tokens):
- GPT-3.5-turbo: $0.50 input / $1.50 output
- GPT-4: $30 input / $60 output
- GPT-4o-mini: $0.15 input / $0.60 output

---

### 5. Why Async Patterns?

**Decision**: Use `async/await` throughout the agent

**Rationale**:
- βœ… **MCP requirement**: Notion MCP tools are async
- βœ… **Long operations**: Transcription can take minutes
- βœ… **Streaming**: Gradio async streaming for better UX
- βœ… **Concurrency**: Handle multiple tool calls efficiently

**Implementation**:
```python
async def generate_response(self, message, history):
    initial_state = {"message": message, "history": history}
    async for event in self.graph.astream(initial_state):
        # Each event is one node's state update; extract_response_text is an
        # illustrative helper standing in for the actual event handling.
        response_chunk = extract_response_text(event)
        if response_chunk:
            yield response_chunk
```

---

## πŸ—‚οΈ State Management

### LangGraph State

**Structure**: TypedDict with annotated message list

```python
class ConversationalAgentState(TypedDict):
    message: str                          # Current query
    history: List[List[str]]              # Gradio format
    llm_messages: Annotated[List[Any], add_messages]  # LangChain format
    response: str                         # Generated response
    error: Optional[str]                  # Error tracking
```

**State Transitions**:
1. **Prepare**: `history` β†’ `llm_messages` (format conversion)
2. **Agent**: `llm_messages` β†’ `llm_messages` (append AI response)
3. **Tools**: `llm_messages` β†’ `llm_messages` (append tool results)
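
The Prepare transition is essentially a format conversion. Representing messages as role/content dicts here is a simplification of the actual LangChain message classes (`SystemMessage`, `HumanMessage`, `AIMessage`):

```python
def history_to_messages(history: list, system_prompt: str) -> list:
    """Convert Gradio [user, assistant] pairs into a flat role/content list."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_turn, assistant_turn in history:
        if user_turn:
            messages.append({"role": "user", "content": user_turn})
        if assistant_turn:
            messages.append({"role": "assistant", "content": assistant_turn})
    return messages
```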

**Persistence**: In-memory only, no database persistence (stateless per session)

---

### Video Processing State

**Purpose**: Track video upload workflow across multiple tool calls

**Storage**: Global dictionary in `src/tools/video.py`

```python
_video_state = {
    "uploaded_video_path": None,
    "transcription_text": None,
    "transcription_segments": None,
    "timing_info": None,
    "show_video_upload": False,
    "show_transcription_editor": False,
    "transcription_in_progress": False
}
```

**Lifecycle**:
1. `request_video_upload()` β†’ sets `show_video_upload = True`
2. `transcribe_uploaded_video()` β†’ stores transcript
3. `upload_transcription_to_pinecone()` β†’ clears state

**Reset**: Automatic after successful upload or manual via `cancel_video_workflow()`

---

### UI State Synchronization

**Challenge**: Keep Gradio UI in sync with agent state

**Solution**: Tools return UI state changes via `get_video_state()`

```python
# Tool returns state
state = get_video_state()
return {
    "show_upload": state["show_video_upload"],
    "show_editor": state["show_transcription_editor"],
    "transcript": state["transcription_text"]
}
```

**Gradio Integration**: UI components update based on returned state

---

## ⚑ Scalability & Performance

### Concurrency

**Current**: Single-user sessions (Gradio default)

**Scalability**:
- βœ… Stateless agent (can handle multiple sessions)
- βœ… Pinecone auto-scales
- βœ… MCP servers deployed independently
- ⚠️ WhisperX requires GPU (bottleneck for concurrent transcriptions)

**Future Improvements**:
- Queue system for transcription jobs
- Separate transcription service (microservice)
- Redis for shared state across instances

---

### Caching

**Current Caching**:
- ❌ No LLM response caching
- ❌ No embedding caching
- βœ… Pinecone handles vector index caching

**Future Improvements**:
- Cache frequent queries (e.g., "list meetings")
- Cache embeddings for repeated text
- LangChain cache for LLM responses

---

### Performance Bottlenecks

1. **Transcription**: 2-5 minutes for typical meeting (GPU-dependent)
2. **Metadata Extraction**: 5-10 seconds (GPT-4o-mini API call)
3. **Embedding**: 1-2 seconds per chunk (OpenAI API)
4. **Pinecone Upsert**: 1-3 seconds for typical meeting

**Optimization Strategies**:
- Parallel embedding generation
- Batch Pinecone upserts
- Async MCP calls
- Streaming responses to user
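
Batched upserts from the list above need only a small batching helper. A batch size of 100 is a commonly recommended value for Pinecone, but the exact figure here is an assumption:

```python
def batched(records: list, batch_size: int = 100):
    """Yield successive fixed-size batches of records for Pinecone upserts."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

# Usage sketch (`index.upsert` is the Pinecone client call, assumed available):
# for batch in batched(vectors, 100):
#     index.upsert(vectors=batch, namespace="development")
```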

---

## πŸ”’ Security Architecture

### API Key Management

**Storage**: Environment variables via `.env` file

```bash
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=...
NOTION_TOKEN=secret_...
```

**Access**: Loaded via `python-dotenv` in `src/config/settings.py`
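
A minimal sketch of that loading path, assuming `settings.py` reads keys with `os.getenv` after `load_dotenv()`; failing fast on a missing key is a suggested pattern, not necessarily the project's exact behavior:

```python
import os

def require_env(name: str) -> str:
    """Read a required API key from the environment, failing fast if absent."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# e.g. OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```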

**Best Practices**:
- βœ… Never commit `.env` to git (`.gitignore` configured)
- βœ… Use HuggingFace Spaces secrets for deployment
- βœ… Rotate keys regularly

---

### Data Privacy

**User Data**:
- Video files: Stored temporarily, deleted after processing
- Transcripts: Stored in Pinecone (user-controlled index)
- Conversation history: In-memory only, not persisted

**Third-Party Data Sharing**:
- OpenAI: Transcripts sent for embedding/metadata extraction
- Pinecone: Encrypted at rest and in transit
- Notion: Only accessed with user's token

**Compliance**:
- GDPR: User can delete Pinecone index
- Data retention: No long-term storage of raw videos

---

### MCP Server Security

**Notion MCP**:
- Authentication: User's Notion token
- Permissions: Limited to token's access scope
- Transport: stdio (local process, no network exposure)

**Time MCP**:
- Authentication: None required (public API)
- Transport: HTTPS (TLS encrypted)
- Rate limiting: HuggingFace Spaces default limits

**Zoom MCP** (planned):
- Authentication: OAuth 2.0
- Webhook validation: HMAC-SHA256 signature
- Transport: HTTPS + WebSocket (TLS)

---

## πŸ› οΈ Technology Stack

### Core Framework
- **Python**: 3.11+
- **LangGraph**: Agent orchestration
- **LangChain**: Tool abstractions, message handling
- **Gradio**: Web UI framework

### AI/ML Models
- **OpenAI GPT-3.5-turbo**: Agent reasoning
- **OpenAI GPT-4o-mini**: Metadata extraction
- **OpenAI text-embedding-3-small**: Vector embeddings
- **WhisperX**: Speech-to-text transcription
- **Pyannote**: Speaker diarization

### Storage & Databases
- **Pinecone**: Vector database (serverless)
- **Local filesystem**: Temporary video storage

### External Integrations
- **Notion API**: Via MCP server
- **Custom Time API**: Via Gradio MCP server
- **Zoom API** (planned): Via custom MCP server

### Development Tools
- **Docker**: Containerization
- **FFmpeg**: Audio extraction
- **pytest**: Testing (planned)
- **LangSmith**: Tracing and debugging (optional)

### Deployment
- **HuggingFace Spaces**: Primary deployment platform
- **Docker**: Container runtime
- **Environment Variables**: Configuration management

---

## πŸ“š Related Documentation

- [TECHNICAL_IMPLEMENTATION.md](TECHNICAL_IMPLEMENTATION.md) - Detailed tool reference and code examples
- [DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md) - Step-by-step deployment instructions
- [README.md](../README.md) - Project overview and quick start

---

## πŸ”„ Version History

- **v4.0** (Current): LangGraph-based conversational agent with MCP integration
- **v3.0**: Experimental agent patterns
- **v2.0**: Basic agent with video processing
- **v1.0**: Initial prototype

---

**Last Updated**: December 5, 2025  
**Maintained By**: Meeting Intelligence Agent Team