FinanceQuery API Architecture¶

Overview¶

FinanceQuery is a FastAPI-based financial data API that aggregates data from Yahoo Finance API and web scraping. The architecture follows a service-oriented design with clear separation of concerns, emphasizing modularity, asynchronous operations, and resilience.

Core Components¶

FastAPI Application (`src/main.py`)¶

The main application handles:

Lifespan Management: Initializes shared resources (HTTP sessions, Redis, authentication, proxy setup)
Middleware Stack: CORS, request context, logging, and optional rate limiting
Error Handling: Custom validation error formatting
Routing: API endpoints under /v1/ prefix with Pydantic validation
Health Checks: /ping (basic) and /health (comprehensive service validation)
AWS Lambda Support: Via Mangum adapter

Data Models (`src/models/`)¶

Pydantic models provide: - API Contract Definition: Consistent request/response structures - Input Validation: Automatic parameter validation - Documentation Generation: OpenAPI schema generation - Type Safety: Runtime type checking and IDE support

Routing System (`src/routes/`)¶

Modular routers organize endpoints by domain: - quotes - Stock quotes and company information - historical_prices - Historical price data - indices - Market indices (S&P 500, NASDAQ, DOW) - sectors - Industry sector performance - movers - Market gainers/losers/most active - finance_news - Financial news aggregation - search - Symbol search functionality - hours - Market hours status - similar - Similar securities lookup - indicators - Technical indicators (SMA, EMA, RSI, MACD, etc.) - financials - Financial statements (income, balance sheet, cash flow) - holders - Institutional and insider holdings data - analysis - Analyst recommendations, price targets, earnings estimates - earnings_transcript - Earnings call transcripts - stream - Server-sent events - sockets - WebSocket connections

Logging & Middleware Architecture¶

Logging System (`src/utils/logging.py`)¶

Comprehensive logging with configurable levels and formats:

Environment Configuration:
LOG_LEVEL: DEBUG/INFO/WARNING/ERROR/CRITICAL
LOG_FORMAT: JSON (production) or text (development)
PERFORMANCE_THRESHOLD_MS: Slow operation warning threshold (default: 2000ms)
DISABLE_LOGO_FETCHING: Disable logo fetching entirely (default: false)
LOGO_TIMEOUT_SECONDS: Logo request timeout (default: 1s)
LOGO_CIRCUIT_BREAKER_THRESHOLD: Failures before circuit breaker opens (default: 5)
LOGO_CIRCUIT_BREAKER_TIMEOUT: Circuit breaker timeout in seconds (default: 300)
Correlation Tracking: Every request gets a unique correlation ID for distributed tracing
Structured Logging: JSON format with contextual metadata
Performance Monitoring: Automatic slow operation detection and logging
External Library Filtering: Reduced noise from urllib3, curl_cffi, Redis
Specialized Logging Functions:
log_api_request/response() - HTTP request/response logging
log_route_request/success/error() - Route-specific logging
log_cache_operation() - Cache hit/miss tracking
log_external_api_call() - Third-party API monitoring
log_performance() - Operation timing with threshold warnings
log_critical_system_failure() - System-level error handling

Middleware Stack¶

LoggingMiddleware (`src/middleware/logging_middleware.py`)¶

Request Correlation: Sets unique correlation ID for each request
Comprehensive Logging: Logs all HTTP requests/responses with timing
Route-Aware Logging: Enhanced logging for API routes vs. static content
Performance Tracking: Automatic operation timing and slow request warnings
Error Context: Detailed error logging with stack traces and request context
Response Headers: Adds X-Correlation-ID header for client tracking

RateLimitMiddleware (`src/middleware/rate_limit_middleware.py`)¶

IP-Based Limiting: Configurable daily request limits per client IP
API Key Bypass: Admin keys bypass all rate limits
Open Paths: Documentation and health endpoints exempt from limits
Health Check Throttling: Special rate limiting for /health endpoint
Response Headers: Rate limit status in X-RateLimit-* headers

Service Layer Architecture¶

Service Organization (`src/services/`)¶

Services implement business logic with a consistent pattern:

src/services/
├── quotes/
│   ├── fetchers/
│   │   ├── quote_api.py      # Primary Yahoo Finance API
│   │   └── quote_scraper.py  # Fallback web scraping
│   ├── get_quotes.py         # Main service with retry logic
│   └── utils.py             # Helper functions
├── indicators/
│   └── core/                # Cython-compiled calculations
└── ...

Multi-Source Strategy¶

Each service implements dual-fetching for resilience:

Primary Source (Yahoo Finance API): Fast, reliable when available
Fallback Source (Web scraping): Resilient to API changes

The @retry decorator automatically falls back from API to scraping on failure.

Technical Indicators (`src/services/indicators/core/`)¶

High-performance numerical computations using Cython:

Cython Modules: moving_averages.pyx, oscillators.pyx, trends.pyx, utils.pyx
Build Requirement: python setup.py build_ext --inplace compiles .pyx → .so files
Performance: Significant speed improvements for mathematical calculations
NumPy Integration: Optimized array operations with NumPy C API

Development Workflow¶

Package Management¶

Primary: uv for fast dependency resolution and virtual environment management
Commands: uv sync --all-groups for development setup
Fallback: Traditional pip with pip install -e . from pyproject.toml

Build Process¶

Essential for technical indicators:

make build                              # Recommended
python setup.py build_ext --inplace   # Manual

Development Commands¶

make help        # Show all commands
make install-dev # Install dependencies + pre-commit hooks
make serve       # Start development server
make test        # Run tests with coverage
make lint        # Run pre-commit hooks (ruff check/format)
make docs        # Serve documentation
make clean       # Clean build artifacts

Data Persistence & Caching¶

Caching Strategy (`src/utils/cache.py`)¶

Flexible caching with market-aware expiration:

Cache Handlers:
RedisCacheHandler: Distributed caching (when Redis available)
MemCacheHandler: Local in-memory fallback
Smart Expiration:
Standard TTL during market hours
Extended TTL when markets closed
Market schedule awareness via MarketSchedule
Cache Key Generation: SHA-256 hash of function name + arguments
Type Preservation: Maintains Pydantic model types after retrieval

Retry Mechanism (`src/utils/retry.py`)¶

Sophisticated retry with fallback capability:

Configurable Retries: Attempts primary function with exponential backoff
Intelligent Fallback: Analyzes function signatures for parameter compatibility
Error Propagation: Preserves HTTP exceptions while retrying network errors

Authentication & HTTP Clients¶

Yahoo Authentication (`src/utils/yahoo_auth.py`)¶

Manages Yahoo Finance API access:

Cookie/Crumb Management: Handles CSRF tokens and session cookies
Automatic Refresh: Maintains valid authentication credentials
Consent Flow Handling: Navigates Yahoo's consent requirements
Thread Safety: Async lock prevents concurrent refresh attempts

HTTP Client Architecture¶

FetchClient (`src/clients/fetch_client.py`)¶

General-purpose async HTTP client: - curl_cffi Integration: Browser impersonation for web scraping - Proxy Support: Configurable proxy routing - Async/Await: Non-blocking operations via asyncio.to_thread

YahooFinanceClient (`src/clients/yahoo_client.py`)¶

Specialized Yahoo Finance API client: - Auto-Authentication: Injects cookies/crumb automatically - Error Handling: Yahoo-specific HTTP status code handling - API Methods: get_quote(), get_chart(), search(), etc.

Real-Time Data Architecture¶

WebSocket Connection Management¶

ConnectionManager (`src/connections/connection_manager.py`)¶

In-memory WebSocket management: - Channel-Based: Groups connections by symbol/topic - Task Management: Per-channel data fetching tasks - Auto-Cleanup: Removes inactive connections and cancels unused tasks

RedisConnectionManager (`src/connections/redis_connection_manager.py`)¶

Distributed WebSocket support: - Redis Pub/Sub: Multi-instance message broadcasting - Load Distribution: Data fetching tasks distributed across instances - Scalability: Supports horizontal scaling of WebSocket connections

Security & Rate Limiting¶

Security Configuration (`src/security/rate_limit_manager.py`)¶

Admin API Keys: ADMIN_API_KEY bypasses all rate limits
Daily Limits: Configurable requests per IP per day
Open Paths: /docs, /ping, /health exempt from limits
Health Check Throttling: Separate rate limiting for health checks
Security Toggle: USE_SECURITY environment variable enables/disables rate limiting

Implementation¶

In-Memory Storage: Rate limit counters with automatic cleanup
IP-Based Tracking: Client identification via request IP
Header Communication: Rate limit status via response headers

Dependency Injection (`src/utils/dependencies.py`)¶

FastAPI-injectable dependencies provide:

Shared Resources: Session objects, Redis clients, authentication
Client Abstractions: Pre-configured HTTP clients
Utility Functions: Reusable async operations
Context Management: Request-specific state and correlation

Deployment Architecture¶

AWS Lambda: Mangum adapter for serverless deployment
Docker: Multi-stage builds with optimized images
Environment Configuration: 12-factor app principles with comprehensive environment variables
Health Monitoring: Comprehensive health checks for all services
Logging Integration: Structured logs for observability platforms

Key Environment Variables¶

Variable	Purpose	Default	Required	Docker Config
`REDIS_URL`	Redis connection for caching/WebSockets	None	No	Runtime
`USE_SECURITY`	Enable rate limiting and API authentication	False	No	Runtime
`ADMIN_API_KEY`	Admin key bypassing rate limits	None	No	Runtime
`USE_PROXY`	Enable proxy for web scraping	False	No	Runtime
`PROXY_URL`	Proxy server URL	None	No	Runtime
`PROXY_TOKEN`	Proxy authentication token	None	No	Runtime
`LOG_LEVEL`	Logging level	INFO	No	Build + Runtime
`LOG_FORMAT`	Log format (json/text)	json	No	Build + Runtime
`BYPASS_CACHE`	Disable caching	False	No	Runtime
`ALGOLIA_APP_ID`	Algolia search app ID	Public default	No	Runtime
`ALGOLIA_API_KEY`	Algolia search API key	Public default	No	Runtime
`DISABLE_LOGO_FETCHING`	Disable logo fetching	false	No	Build + Runtime
`LOGO_TIMEOUT_SECONDS`	Logo request timeout	1	No	Build + Runtime
`LOGO_CIRCUIT_BREAKER_THRESHOLD`	Circuit breaker failure threshold	5	No	Build + Runtime
`LOGO_CIRCUIT_BREAKER_TIMEOUT`	Circuit breaker timeout (seconds)	300	No	Build + Runtime
`PERFORMANCE_THRESHOLD_MS`	Slow operation warning threshold	2000	No	Build + Runtime

Docker Configuration¶

Both Dockerfile and Dockerfile.aws support all environment variables:

Build-time: Logo fetching and logging configs can be baked into image
Runtime: All variables can be overridden when running containers
Compose Ready: All variables work with docker-compose configurations

FinanceQuery API Architecture¶

Overview¶

Core Components¶

FastAPI Application (src/main.py)¶

Data Models (src/models/)¶

Routing System (src/routes/)¶