Commit Graph

58 Commits

Author SHA1 Message Date
Douwe Osinga 6ae72a6cf0 Use haiku for databricks (#6943)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
2026-02-04 17:17:59 +00:00
jh-block 5492052f0c Allow copying long strings to the clipboard in the diagnostics viewer (#6951) 2026-02-04 18:17:50 +01:00
Elias Posen 8631caa890 Use Port of Context (pctx) for code mode (#6765)
Signed-off-by: Elias Posen <elias@posen.ch>
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Adrian Cole <adrian@tetrate.io>
2026-02-03 12:15:49 -05:00
Jack Amadeo eae5a47788 More providers for testing (#6849) 2026-02-03 16:21:59 +00:00
Jack Amadeo a398a77682 allow skipping providers in test_providers.sh (#6778) 2026-01-28 13:30:26 -05:00
Douwe Osinga d7ead8d980 Improve mcp test (#6671)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-28 13:04:54 -05:00
Douwe Osinga 5f82efebd8 Add diagnostics viewer (#6770)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
2026-01-28 10:37:25 -05:00
Michael Neale e78a1e7d4e feat: codex subscription support (#6600)
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Adrian Cole <adrian@tetrate.io>
2026-01-23 17:11:58 +11:00
Zane e7bfdf8fa2 smoke test allow pass for flaky providers (#6638) 2026-01-22 16:10:41 -08:00
dependabot[bot] f9cc87dc38 chore(deps): bump aiohttp from 3.13.0 to 3.13.3 in /scripts/provider-error-proxy (#6539)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 12:47:45 -05:00
dependabot[bot] e202c46969 chore(deps): bump brotli from 1.1.0 to 1.2.0 in /scripts/provider-error-proxy (#6538)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 12:41:29 -05:00
Lifei Zhou 7c3f41e977 fixed test compilation on main branch (#6512) 2026-01-15 08:48:54 -06:00
Lifei Zhou 637049c967 Fix: exclude platform_schedule_tool in CLI (#6442) 2026-01-13 11:35:58 +11:00
Bradley Axen 38f5f338cb fix: improve smoke test prompt for reliable tool calling (#6281)
Co-authored-by: Michael Neale <michael.neale@gmail.com>
2025-12-31 15:52:37 -08:00
Michael Neale 8ec6332738 fix: adding more open models (#6300) 2025-12-31 09:48:05 +11:00
Michael Neale 5ca7eb2305 chore: Update gemini versions in test_providers.sh (#6246) 2025-12-23 11:12:19 +11:00
Alex Hancock 7134e89c4b feat: improved UX for tool calls via execute_code (#6205) 2025-12-22 10:42:20 -05:00
Michael Neale d4814042e6 chore: cover code mode with end to end provider tests (#6183) 2025-12-19 12:02:06 +08:00
Jack Amadeo 7ff3adcc5f Clean PR preview sites from gh-pages branch history (#6161) 2025-12-18 16:22:57 -05:00
Jack Amadeo 9fdb0356f0 Disallow subagents with no extensions (#5825) 2025-12-15 12:45:42 -05:00
tlongwell-block a131b08817 refactor: unify subagent and subrecipe tools into single tool (#5893) 2025-12-13 13:50:20 -05:00
Michael Neale 7dd244eff6 chore: avoid accidentally using native tls again (#6086) 2025-12-12 11:35:52 +11:00
Douwe Osinga 5f50198318 feat: @goose in terminal (native terminal support) (#5887)
Co-authored-by: Bradley Axen <baxen@squareup.com>
Co-authored-by: Michael Neale <michael.neale@gmail.com>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-01 17:40:17 +11:00
David Katz c1c772b267 Add out of context compaction test via error proxy (#5805) 2025-11-21 14:51:01 -05:00
Douwe Osinga f4724cbf23 Comment out the flaky mcp callers (#5827)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
2025-11-20 21:21:38 +01:00
Salvatore Testa cfdf01567d fix: support Gemini 3's thought signatures (#5806)
Signed-off-by: Salvatore Testa <sal@withpersona.com>
2025-11-20 16:28:27 +11:00
David Katz 1d8d6a1788 Provider error proxy for simulating various types of errors (#5091) 2025-11-18 17:28:07 -05:00
Michael Neale 2bef034303 feat: trying grok for live test (#5732) 2025-11-17 09:37:43 +11:00
Jack Amadeo d4f66f4855 faster, cheaper (pick two): improve CI workflow and switch to free github runner (#5702)
Co-authored-by: Douwe Osinga <douwe@block.xyz>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-14 12:58:57 -05:00
Jack Amadeo 5110d32142 bump openapi version directly (#5674) 2025-11-11 10:15:42 -05:00
Alex Hancock 7ec3b84ad7 fix: gemini flash -> pro for mcp smoke tests (#5574) 2025-11-06 10:05:18 -05:00
David Katz eb29083a52 Manual compaction test and fix (#5568) 2025-11-06 10:03:48 -05:00
Zane 89f7384d57 add clippy warning for string_slice (#5422)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
2025-11-04 17:46:25 -05:00
Michael Neale 7511a533d6 we should run this on main and also test open models at least via ope… (#5556)
adds qwen3-code and GLM 4.6 to test_providers for open model coverage
2025-11-04 09:06:23 +11:00
Alex Hancock 38e7dc8f30 fix: remove qwen3-coder from provider/mcp smoke tests (#5551) 2025-11-03 14:33:49 -05:00
Alex Hancock c1c13716e0 chore(tests/mcp): testing for MCP sampling (#5456) 2025-11-03 12:23:11 -05:00
Amed Rodriguez d9633ff1d9 Change Recipes Test Script (#5457) 2025-10-30 16:00:25 -07:00
Michael Neale b94535b679 testing tetrate with sonnet (#5428) 2025-10-29 11:40:02 +11:00
Amed Rodriguez 4687656487 Add Recipes Test Script (#5420) 2025-10-28 17:17:51 -07:00
Douwe Osinga 6b6c50976c Gemini again (#5390)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
2025-10-27 16:41:00 -04:00
Will Pfleger 044b227fdb (re)Standardize Session Name Attribute (#5279) 2025-10-24 13:34:08 -04:00
Michael Neale 3c975bb358 live testing script (#5263)
Co-authored-by: Jack Amadeo <jackamadeo@squareup.com>
2025-10-21 16:39:58 +11:00
Douwe Osinga 64b37339e0 Skip subagents for gemini (#5257)
Co-authored-by: Douwe Osinga <douwe@squareup.com>
2025-10-18 17:35:29 -04:00
Michael Neale 890393bb68 Revert "Standardize Session Name Attribute" (#5250) 2025-10-18 12:44:30 -04:00
Will Pfleger b8c3508178 Standardize Session Name Attribute (#5085) 2025-10-17 17:05:41 -04:00
Jack Amadeo 757ceb6109 chore: turn clippy on for test code (#4817) 2025-09-26 00:06:07 -04:00
Angie Jones 63f3669cf7 Remove deprecated Claude 3.5 models (#4590) 2025-09-10 14:41:02 -05:00
Jack Amadeo 7c2b40cc21 Clean up langfuse docs and scripts (#4220) 2025-08-20 10:46:31 -04:00
Jack Amadeo dd504741a3 Remove cognitive complexity clippy lint (#4010) 2025-08-11 20:24:37 -04:00
Michael Neale 8f54fa84a5 fix: optimise reading large file content (#3767) 2025-08-06 09:38:52 +10:00