GitHub – katanemo/arch: Arch is an intelligent prompt gateway. Engineered with (fast) LLMs for the secure handling, robust observability, and seamless integration of prompts with APIs




Build fast, robust, and personalized AI agents.

Arch is an intelligent Layer 7 gateway designed to protect, observe, and personalize LLM applications (agents, assistants, co-pilots) with your APIs.

Engineered with purpose-built LLMs, Arch handles the critical but undifferentiated tasks related to the handling and processing of prompts, including detecting and rejecting jailbreak attempts, intelligently calling “backend” APIs to fulfill the user’s request represented in a prompt, routing to and offering disaster recovery between upstream LLMs, and managing the observability of prompts and LLM interactions in a centralized way.

Arch is built on (and by the core contributors of) Envoy Proxy with the belief that:

Prompts are nuanced and opaque user requests, which require the same capabilities as traditional HTTP requests, including secure handling, intelligent routing, robust observability, and integration with backend (API) systems for personalization – all outside business logic.*

Core Features:

  • Built on Envoy: Arch runs alongside application servers, and builds on top of Envoy’s proven HTTP management and scalability features to handle ingress and egress traffic related to prompts and LLMs.
  • Function Calling for fast Agentic and RAG apps: engineered with purpose-built LLMs to handle fast, cost-effective, and accurate prompt-based tasks like function/API calling and parameter extraction from prompts.
  • Prompt Guard: Arch centralizes prompt guardrails to prevent jailbreak attempts and ensure safe user interactions without writing a single line of code.
  • Traffic Management: Arch manages LLM calls, offering smart retries, automatic cutover, and resilient upstream connections for continuous availability.
  • Standards-based Observability: Arch uses the W3C Trace Context standard to enable complete request tracing across applications, ensuring compatibility with observability tools, and provides metrics to monitor latency, token usage, and error rates, helping improve AI application performance.
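To make the W3C Trace Context standard mentioned above concrete: a `traceparent` header value consists of a version, a 16-byte trace id, an 8-byte parent span id, and trace flags, all hex-encoded. A minimal sketch of generating and validating one (the helper names here are illustrative, not part of Arch's API):

```python
import re
import secrets

def make_traceparent() -> str:
    """Build a W3C Trace Context `traceparent` header value:
    version "00", 16-byte trace-id, 8-byte parent-id, sampled flag "01"."""
    trace_id = secrets.token_hex(16)   # 32 hex chars
    parent_id = secrets.token_hex(8)   # 16 hex chars
    return f"00-{trace_id}-{parent_id}-01"

# Pattern for the version-00 traceparent format
TRACEPARENT_RE = re.compile(r"^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")
```

Propagating a header of this shape across services is what lets a tracing backend stitch the gateway's spans together with the application's.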

Jump to our docs to learn how you can use Arch to improve the speed, security, and personalization of your GenAI apps.

To get in touch with us, please join our Discord server. We will be monitoring it actively and offering support there.

Follow this guide to learn how to quickly set up Arch and integrate it into your generative AI applications.

Before you begin, ensure you have the following:

  • Docker & Python installed on your system
  • API keys for LLM providers (if using external LLMs)

Step 1: Install Arch

Arch’s CLI allows you to manage and interact with the Arch gateway efficiently. To install the CLI, simply run the following command:
Tip: We recommend that developers create a new Python virtual environment to isolate dependencies before installing Arch. This ensures that archgw and its dependencies do not interfere with other packages on your system.

$ python -m venv venv
$ source venv/bin/activate   # On Windows, use: venv\Scripts\activate
$ pip install archgw

Step 2: Configure Arch with your application

Arch operates based on a configuration file where you can define LLM providers, prompt targets, guardrails, etc.
Below is an example configuration to get you started:

version: v0.1

listen:
  address: 0.0.0.0 # or 127.0.0.1
  port: 10000
  # Defines how Arch should parse the content from application/json or text/plain Content-type in the http request
  message_format: huggingface

# Centralized way to manage LLMs: keys, retry logic, failover and limits
llm_providers:
  - name: OpenAI
    provider: openai
    access_key: OPENAI_API_KEY
    model: gpt-4o
    default: true
    stream: true

# default system prompt used by all prompt targets
system_prompt: You are a network assistant that just offers facts; not advice on manufacturers or purchasing decisions.

prompt_targets:
  - name: reboot_devices
    description: Reboot specific devices or device groups

    path: /agent/device_reboot
    parameters:
      - name: device_ids
        type: list
        description: A list of device identifiers (IDs) to reboot.
        required: false
      - name: device_group
        type: str
        description: The name of the device group to reboot
        required: false

# Arch creates round-robin load balancing between different endpoints, managed via the cluster subsystem.
endpoints:
  app_server:
    # value could be an ip address or a hostname with a port
    # this could also be a list of endpoints for load balancing
    # for example endpoint: [ ip1:port, ip2:port ]
    endpoint: 127.0.0.1:80
    # max time to wait for a connection to be established
    connect_timeout: 0.005s
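When Arch resolves the `reboot_devices` prompt target, it extracts the declared parameters from the prompt and forwards them to the application's `/agent/device_reboot` path. A hypothetical handler sketch for that endpoint (the function and its body are illustrative, not part of Arch; since both parameters are `required: false` in the config, the handler validates that at least one arrived):

```python
from typing import List, Optional

def reboot_devices(device_ids: Optional[List[str]] = None,
                   device_group: Optional[str] = None) -> dict:
    """Handle the parameters Arch extracts from the user's prompt.
    Both are optional in the prompt-target config, so validate here."""
    if not device_ids and not device_group:
        return {"status": "error", "detail": "device_ids or device_group required"}
    targets = device_ids if device_ids else [f"group:{device_group}"]
    # ... trigger the actual reboot for each target here ...
    return {"status": "ok", "rebooted": targets}
```

For example, a prompt like "reboot the edge-routers group" would arrive as `reboot_devices(device_group="edge-routers")`.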

Step 3: Using OpenAI Client with Arch as an Egress Gateway

Make outbound calls via Arch

import openai

# Set the OpenAI API base URL to the Arch gateway endpoint
# (note: this is the legacy pre-1.0 openai client interface)
openai.api_base = "http://127.0.0.1:51001/v1"

# No need to set openai.api_key since it's configured in Arch's gateway

# Use the OpenAI client as usual
response = openai.Completion.create(
   model="text-davinci-003",
   prompt="What is the capital of France?"
)

print("OpenAI Response:", response.choices[0].text.strip())
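The same egress call can also be made with only the standard library, which makes the gateway's role explicit: the client just addresses an OpenAI-compatible route on Arch. A sketch, assuming the egress listener address from the snippet above (`127.0.0.1:51001`) and a `/v1/completions` route (the helper names are illustrative):

```python
import json
import urllib.request

ARCH_EGRESS = "http://127.0.0.1:51001/v1"  # assumed Arch egress listener

def build_completion_request(prompt: str, model: str = "gpt-4o") -> urllib.request.Request:
    """Build an OpenAI-style completion request addressed to the Arch gateway.
    No API key header is attached: the key lives in Arch's llm_providers config."""
    body = json.dumps({"model": model, "prompt": prompt}).encode()
    return urllib.request.Request(
        f"{ARCH_EGRESS}/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt: str) -> str:
    """Send the request through Arch and return the first completion's text."""
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["text"].strip()
```

Because the provider key is held by the gateway, rotating or swapping providers needs no client-side change.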

Arch is designed to provide best-in-class observability by supporting open standards. Please read our docs on observability for more details on tracing, metrics, and logs.

We would love feedback on our Roadmap and we welcome contributions to Arch!
Whether you’re fixing bugs, adding new features, improving documentation, or creating tutorials, your help is much appreciated.
Please visit our Contribution Guide for more details.
