# Providers And Authentication

## Default Behavior

`--provider` selects the credential namespace and provider-specific request behavior. If omitted, the default provider is `openai`.

Credential lookup order is:

1. Explicit CLI overrides via `--api_key_var` and `--api_base_var`
2. Provider-aware defaults inferred from the provider slug
3. `.env`
4. Existing shell environment

Examples:

- `OPENAI_API_KEY`
- `REQUESTY_API_KEY`
- `REQUESTY_BASE_URL`
- `CUSTOMROUTER_ACCESS_TOKEN`
- `CUSTOMROUTER_BASE_URL`

## Request Controls

Use only the controls supported by your target provider:

- `--reasoning_effort {low,medium,high,xhigh}` for OpenAI-style reasoning
- `--verbosity {low,medium,high}` for GPT output detail
- `--thinking_level {minimal,low,medium,high}` for Gemini thinking configuration
- `--effort {low,medium,high,max}` for Claude-style effort
- `--prompt_cache_key` for OpenAI-style cache routing
- `--gemini_cached_content` or `--create_gemini_cache` for Gemini context caching
- `--requesty_auto_cache` for Requesty automatic caching
- `--strict_control_acceptance` to fail when requested controls are dropped

For prompt-caching experiments, start with `--prompt_layout compact`.

## System Prompt Encoding

Use `--system_prompt` for normal single-line prompts. For multiline or shell-sensitive prompts, use `--system_prompt_b64`.

The browser GUI switches automatically:

- one-line prompt -> `--system_prompt`
- multiline prompt -> `--system_prompt_b64`

## Vertex

For `--provider vertex`, authentication is OAuth access-token based rather than static API-key based.

Relevant environment variables:

- `VERTEX_ACCESS_TOKEN_COMMAND`
- `VERTEX_ACCESS_TOKEN`
- `VERTEX_BASE_URL`
- `VERTEX_MODELS_BASE_URL`
- `VERTEX_ACCESS_TOKEN_REFRESH_SECONDS`
- `VERTEX_AUTO_ADC_LOGIN`
- `VERTEX_ADC_LOGIN_COMMAND`

### Browser-Capable Workstation

Recommended setup:

```bash
VERTEX_BASE_URL=https://us-central1-aiplatform.googleapis.com/v1/projects/<PROJECT_ID>/locations/us-central1/endpoints/openapi
VERTEX_AUTO_ADC_LOGIN=true
VERTEX_ACCESS_TOKEN_COMMAND=gcloud auth application-default print-access-token
```

- If ADC is missing, the agent can trigger `gcloud auth application-default login`.
- If Vertex reports a quota-project 403, set it once with `gcloud auth application-default set-quota-project <PROJECT_ID>`.
- The same auth flow is used for normal benchmark runs and `--update-models`.

### CLI-Only Or Remote Environments

Disable browser auto-login:

```bash
VERTEX_AUTO_ADC_LOGIN=false
```

Then use one of:

- a non-interactive `VERTEX_ACCESS_TOKEN_COMMAND`
- pre-provisioned ADC on the machine or runtime identity
- a manually provided short-lived `VERTEX_ACCESS_TOKEN`

CLI equivalents:

- `--no-vertex_auto_adc_login`
- `--vertex_access_token_refresh_seconds N`

### Model Catalog Fallback

When `--update-models` is used, the agent first tries `<BASE>/models`, where `<BASE>` is:

- `VERTEX_MODELS_BASE_URL` when set
- otherwise `VERTEX_BASE_URL`

If `<BASE>/models` returns 404, the agent falls back to:

`https://aiplatform.googleapis.com/v1beta1/publishers/google/models`

This allows separate runtime and catalog endpoints.
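As a minimal sketch of a split runtime/catalog configuration, the shell snippet below sends requests to the regional endpoint while fetching the catalog from a separate URL. The project ID and the `VERTEX_MODELS_BASE_URL` value are illustrative placeholders, not endpoints confirmed by this tool:

```bash
# Runtime endpoint used for benchmark requests (placeholder project ID).
export VERTEX_BASE_URL=https://us-central1-aiplatform.googleapis.com/v1/projects/<PROJECT_ID>/locations/us-central1/endpoints/openapi

# Hypothetical catalog endpoint; if <BASE>/models returns 404, the agent
# falls back to the public publishers/google/models URL described above.
export VERTEX_MODELS_BASE_URL=https://aiplatform.googleapis.com/v1beta1/publishers/google

python benchmark_agent.py --provider vertex --update-models
```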
## Updating `config_models.js`

Regenerate the model catalog consumed by the GUI:

```bash
python benchmark_agent.py --update-models
```

Useful flags:

- `--models-providers` to limit the provider list
- `--models-output` to change the output path

## Custom Providers

Custom OpenAI-compatible providers do not need hardcoded support as long as you provide:

- a provider slug via `--provider`
- credentials in `<PROVIDER>_API_KEY` or `<PROVIDER>_ACCESS_TOKEN`, where `<PROVIDER>` is the uppercased provider slug
- a base URL in `<PROVIDER>_BASE_URL`

Example:

```bash
python benchmark_agent.py \
  --provider myrouter \
  --model some-model
```

with environment variables:

```bash
MYROUTER_API_KEY=...
MYROUTER_BASE_URL=https://example.com/v1
```
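If the router issues bearer tokens rather than static API keys, the same pattern applies with the `_ACCESS_TOKEN` variable instead; this sketch assumes the hypothetical `myrouter` endpoint accepts the token as-is:

```bash
# Token-based variant of the same custom provider setup.
MYROUTER_ACCESS_TOKEN=...
MYROUTER_BASE_URL=https://example.com/v1
```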