Commit Graph

646 Commits

Author SHA1 Message Date
Ali Khokhar 25b329a3fc Update README
Removed duplicate VSCode Extension Setup instructions from README.md.
2026-03-01 05:30:30 -08:00
Alishahryar1 35a2760f6e Fixed encapsulation violations 2026-03-01 04:28:22 -08:00
Alishahryar1 302ee28585 Removed dead code 2026-03-01 04:21:06 -08:00
Alishahryar1 34757511a0 Improve deterministic error surfacing across stream and API 2026-03-01 01:32:52 -08:00
Alishahryar1 7f2612d2df Added optimization logging 2026-03-01 01:02:59 -08:00
Ali Khokhar aee9f0ad93 Add code review fix plan covering 11 issues across modularity, encapsulation, performance, and dead code (#62) 2026-03-01 00:45:33 -08:00
Alishahryar1 c54c57a742 style 2026-02-28 09:13:25 -08:00
Alishahryar1 744eec2772 Major cleanup with GLM-5 2026-02-28 09:10:21 -08:00
Mauro Druwel de70700dde feat: Use NVIDIA NIM ASR for audio transcription (#53)
## Summary
Added NVIDIA NIM as a second transcription option (alongside local
Whisper). This lets you transcribe voice notes using NVIDIA's cloud API
instead of running Whisper locally.

## What changed

- **Transcription**: Now supports two backends

  - Local Whisper: Free, runs on your GPU/CPU (existing)
  - NVIDIA NIM: Cloud API via Riva gRPC (new)

- **Supported models**: 8 NVIDIA NIM models added (Parakeet variants for
different languages, Whisper Large V3)

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-02-28 08:48:59 -08:00
Alishahryar1 f1f6080224 Updated agent instructions and renamed lint check to format check 2026-02-28 07:20:00 -08:00
Alishahryar1 a74ec74271 Major refactor done with minimax m2.5 2026-02-28 04:36:29 -08:00
Alishahryar1 cfe43bf5be Updated README 2026-02-28 04:21:05 -08:00
Ali Khokhar 7d99b38b70 Update environment variable syntax in README 2026-02-28 04:04:56 -08:00
Alishahryar1 79a1ae0c54 minor refactor using minimax m2.5 2026-02-27 20:44:39 -08:00
Ali Khokhar f9e8226120 Clarify Docker integration acceptance in README
Updated README to clarify Docker integration status.
2026-02-27 20:00:57 -08:00
Ali Khokhar c4d8681000 Backup/before cleanup 20260222 230402 (#58) 2026-02-27 19:50:21 -08:00
Ali Khokhar e2840095ce Merge pull request #47 from Alishahryar1/cursor/readme-env-example-consistency-0722 2026-02-20 01:38:00 -08:00
Cursor Agent 5d5055f96f docs: update README for removed PROVIDER_TYPE, model prefix format
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-20 09:37:25 +00:00
Ali Khokhar 4c0c1f125b Update README.md 2026-02-20 01:33:57 -08:00
Alishahryar1 d6a0e1a401 Provider inferred from model name using prefix 2026-02-19 20:53:02 -08:00
Alishahryar1 21959b6189 lint 2026-02-19 20:40:05 -08:00
Alishahryar1 b6e602d058 removed deadcode 2026-02-19 20:39:38 -08:00
Alishahryar1 0c8d59e33e Removed deprecated modules and updated imports 2026-02-19 20:38:11 -08:00
Alishahryar1 2b0495dd08 moved text.py to common utils for providers 2026-02-19 20:32:45 -08:00
Alishahryar1 2ad64cc97a quoted string vars in env example 2026-02-19 20:27:28 -08:00
Alishahryar1 d21ed84171 updated uv version 2026-02-19 20:23:37 -08:00
Alishahryar1 2c1158f62f removed a test 2026-02-19 20:06:15 -08:00
Alishahryar1 aec4510a0a Upgraded python version 2026-02-19 20:02:44 -08:00
Alishahryar1 39cc39c341 standardized python version 2026-02-19 20:01:01 -08:00
Ali Khokhar 81a73f3349 Merge pull request #46 from rishiskhare/main 2026-02-19 10:17:26 -08:00
Rishi Khare 8ffe587a8f docs: rename model picker summary to Multi-Model Support (Model Picker)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:40:09 -05:00
Rishi Khare a5496346ca docs: clarify claude-pick avoids needing to edit MODEL in .env
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:16:22 -05:00
Rishi Khare 39ad80f6e6 docs: mention source ~/.bashrc as alternative to ~/.zshrc in model picker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:00:43 -05:00
Rishi Khare 5c6d8e150e docs: move model picker to summary within getting started and add demo video
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 09:58:55 -05:00
Ali Khokhar 00e6419881 Merge pull request #45 from Alishahryar1/claude/queue-concurrency-limits-EFnHT 2026-02-19 06:41:11 -08:00
Claude 45b7e4cafd Make PROVIDER_MAX_CONCURRENCY required with default of 5
- `max_concurrency` is now always an `int` (default 5) — `None`/unlimited
  is no longer a valid state; omitting the env var uses the default
- `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()`
  no longer has None guards; log message always includes concurrency
- `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`)
- `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` —
  setting env var to an invalid value (e.g. empty string) raises
- `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5`
- README: updated config table default from `—` to `5`
- Tests: removed `test_concurrency_slot_noop_when_not_configured`;
  updated mock settings to use `5` instead of `None`

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:39:42 +00:00
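
The commit above (45b7e4cafd) makes `PROVIDER_MAX_CONCURRENCY` always resolve to an `int`, with 5 used when the variable is omitted and an error raised for invalid values such as an empty string. A minimal standard-library sketch of that behavior follows. It is not the repository's code, which per the commit message uses `Field(default=5, ...)` on a `Settings` class. The function name `load_provider_max_concurrency` and the "must be at least 1" check are assumptions made for illustration.

```python
import os

# Default taken from the commit message: omitting the env var yields 5.
DEFAULT_MAX_CONCURRENCY = 5


def load_provider_max_concurrency(env=None) -> int:
    """Hypothetical helper: resolve PROVIDER_MAX_CONCURRENCY to an int.

    `env` is any mapping of variable names to strings; it defaults to
    os.environ so the function can be exercised without touching the
    real environment.
    """
    env = os.environ if env is None else env
    raw = env.get("PROVIDER_MAX_CONCURRENCY")
    if raw is None:
        # Variable omitted entirely: fall back to the default.
        return DEFAULT_MAX_CONCURRENCY
    try:
        value = int(raw)
    except ValueError:
        # Covers the empty string and any other non-integer text.
        raise ValueError(
            f"PROVIDER_MAX_CONCURRENCY must be an integer, got {raw!r}"
        ) from None
    if value < 1:
        # Assumption: a cap below 1 would block every request, so reject it.
        raise ValueError("PROVIDER_MAX_CONCURRENCY must be at least 1")
    return value
```

With this shape there is no `None`/unlimited state for callers to guard against, which matches the commit's removal of the `None` checks in `concurrency_slot()`.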
Claude 41fd316c76 Update README for provider concurrency and removal of MAX_CLI_SESSIONS
- Config table: add PROVIDER_MAX_CONCURRENCY, remove MAX_CLI_SESSIONS
- Discord Bot capabilities: replace "Up to 10 concurrent" with "Unlimited concurrent... (controlled by PROVIDER_MAX_CONCURRENCY)"
- Features table: note optional concurrency cap in Smart Rate Limiting row

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:34:15 +00:00
Claude 99f99fce90 Remove max_cli_sessions — CLI session pool is now unbounded
The max_sessions cap in CLISessionManager was the only thing enforcing
a limit on concurrent CLI processes. Now that provider concurrency is
controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore),
the CLI session pool cap is redundant and removed entirely.

Changes:
- cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats()
- config/settings.py: remove max_cli_sessions field
- api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor
- messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI
- .env.example: remove MAX_CLI_SESSIONS line
- tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests
- tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior
- tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings
- tests/config/test_config.py: remove max_cli_sessions isinstance assertion
- tests/conftest.py: remove max_sessions from mock stats
- tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test
- tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:31:47 +00:00
Claude afaf50a972 Add queue-level concurrency limit to provider streaming
Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore.
A request now waits for a concurrency slot before the sliding window rate
limit check, so at most N streams are open to the provider simultaneously,
even when the rate window would allow more.

Changes:
- providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager
- providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot()
- providers/base.py: max_concurrency field on ProviderConfig
- config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited)
- api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations
- .env.example: document PROVIDER_MAX_CONCURRENCY (commented out)
- tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured
- tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:23:21 +00:00
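
The commit above (afaf50a972) describes capping open provider streams with an `asyncio.Semaphore` exposed through a `concurrency_slot()` async context manager. The sketch below shows that pattern in isolation, under stated assumptions: `ConcurrencyLimiter`, `fake_stream`, and `run_demo` are hypothetical names, the `active`/`peak` counters exist only to make the cap observable, and the sliding-window rate limit that the real `GlobalRateLimiter` applies after acquiring a slot is left out.

```python
import asyncio
from contextlib import asynccontextmanager


class ConcurrencyLimiter:
    """Hypothetical stand-in for the concurrency part of GlobalRateLimiter."""

    def __init__(self, max_concurrency: int) -> None:
        self._concurrency_sem = asyncio.Semaphore(max_concurrency)
        self.active = 0  # streams currently holding a slot
        self.peak = 0    # highest value `active` ever reached

    @asynccontextmanager
    async def concurrency_slot(self):
        # Wait here until fewer than max_concurrency streams are open.
        async with self._concurrency_sem:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                yield
            finally:
                # Runs on normal exit and on exceptions; the semaphore
                # itself is released when the `async with` block exits.
                self.active -= 1


async def fake_stream(limiter: ConcurrencyLimiter, request_id: int) -> int:
    """Simulated provider call that holds a slot for its whole duration."""
    async with limiter.concurrency_slot():
        await asyncio.sleep(0.01)  # stands in for streaming the response
        return request_id


async def run_demo(n_requests: int = 6, max_concurrency: int = 2):
    limiter = ConcurrencyLimiter(max_concurrency)
    results = await asyncio.gather(
        *(fake_stream(limiter, i) for i in range(n_requests))
    )
    return results, limiter.peak
```

Calling `asyncio.run(run_demo(6, 2))` starts six simulated requests while never letting more than two hold a slot at once. Wrapping the whole stream iteration in the slot, as the commit does for `execute_with_retry` plus iteration, is what keeps the cap tied to open streams rather than to request starts.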
Alishahryar1 4aff0b910f provider type quoted 2026-02-18 19:54:30 -08:00
Alishahryar1 cf1284b784 default voice note enabled set to false 2026-02-18 19:54:13 -08:00
Alishahryar1 416f259c8b Reordered env example 2026-02-18 19:53:30 -08:00
Alishahryar1 c35ecba9d8 Update Whisper model configuration to use 'base' as the default model ID 2026-02-18 19:36:58 -08:00
Alishahryar1 9a18e4f1d8 Removed plan.md 2026-02-18 18:51:40 -08:00
Ali Khokhar 4984fffffa Merge pull request #43 from suryawanshishantanu6/feature/fix-input-token 2026-02-18 18:43:14 -08:00
Shantanu Suryawanshi 3d81f44803 Merge branch 'main' into feature/fix-input-token 2026-02-18 21:41:26 -05:00
Ali Khokhar 889556c2f9 Merge pull request #42 from rishiskhare/model-picker 2026-02-18 18:38:41 -08:00
Shantanu Suryawanshi 24a5e4d968 Fixing sse stream 2026-02-18 21:31:28 -05:00
Alishahryar1 e7ac85264f Improved optimizations to decrease llm calls further and increase throughput 2026-02-18 17:54:41 -08:00
Alishahryar1 593fb55954 Added fix for large replies being truncated entirely leaving no response text 2026-02-18 17:39:38 -08:00