Using bia-bob with azure hosted OpenAI model · Issue #236 · haesleinhuepf/bia-bob · GitHub

Using bia-bob with azure hosted OpenAI model #236


Open
sebi06 opened this issue Mar 25, 2025 · 9 comments · May be fixed by #238

Comments

@sebi06
sebi06 commented Mar 25, 2025

Hi @haesleinhuepf

I tried to test your really cool "bia-bob" a bit internally, but so far I just cannot get it to work, and I think it might be due to our internal environment.

  • when using our internal endpoint & key I can list all our models (gpt's and others), for example: gpt-4o-2024-08-06
...
# that code works fine
from openai import AzureOpenAI

client = AzureOpenAI(
                api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
                api_key=os.environ["AZURE_OPENAI_API_KEY"],
            )
...

Later in my notebook I tried bob.initialize(endpoint="azure", model="gpt-4o-2024-08-06"), and also using the endpoint directly, but I always get errors. I tried to modify "_utilities.py" because I think our internal endpoint confuses the code, because:

...
elif "gpt-" in model:
        full_response = generate_response_from_openai(
            model,
            system_prompt,
            user_prompt,
            chat_history,
            image,
            vision_model=Context.vision_model,
            vision_system_prompt=vision_system_prompt,
        )

gets called because we are using gpt- models, but ours are not hosted by OpenAI. This then leads to: NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}

Our endpoint looks like: https://xyz.openai.azure.com/

Do you have a hint what I need to modify to make this work?
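(Editor's note, a sketch of my understanding rather than anything confirmed in this thread: the 404 is consistent with a routing mismatch. The plain OpenAI client posts to {base_url}/chat/completions, while an Azure OpenAI resource serves chat completions under a deployment-scoped path with an api-version query parameter. The endpoint and deployment names below are placeholders.)

```python
# Contrast the URL the plain openai.OpenAI client posts to with the URL an
# Azure OpenAI resource actually serves. "xyz" and the deployment name are
# placeholders, not real resources.

def openai_style_url(base_url: str) -> str:
    """URL shape the plain OpenAI client posts chat completions to."""
    return base_url.rstrip("/") + "/chat/completions"

def azure_style_url(endpoint: str, deployment: str, api_version: str) -> str:
    """URL shape an Azure OpenAI resource expects (deployment-scoped)."""
    return (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")

# The plain client hits .../chat/completions, which the Azure resource
# does not serve at the root -> 404 Resource not found.
print(openai_style_url("https://xyz.openai.azure.com/"))
print(azure_style_url("https://xyz.openai.azure.com/",
                      "gpt-4o-2024-08-06", "2024-02-15-preview"))
```

This would explain why the AzureOpenAI client (which builds the deployment-scoped URL itself) works while an OpenAI-style base_url against the same host does not.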

@haesleinhuepf
Owner

Hi @sebi06 ,

thanks for reaching out! I never tested an azure-endpoint that is not github-models, so this might indeed be buggy. I presume we could make it work by modifying the correct_endpoint function in src/bia_bob/_utilities.py. Let's see if the AI agrees:

git-bob comment on how to do this.

Contributor

This message was generated by git-bob (version: 0.20.0, model: gpt-4o-2024-08-06, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To adapt bia-bob for using Azure-hosted OpenAI models, you might need to modify the generate_response function in src/bia_bob/_utilities.py. Since your endpoint is Azure, ensure that the generate_response_from_azure function is used properly. You can adjust the conditional logic like shown below:

def generate_response(chat_history, image, model, system_prompt, user_prompt, vision_system_prompt):
    from ._machinery import Context
    from .endpoints._azure import generate_response_from_azure

    if Context.endpoint == "azure" and "gpt-" in model:
        full_response = generate_response_from_azure(
            model, system_prompt, user_prompt, chat_history, image,
            base_url=Context.endpoint, api_key=Context.api_key,
            vision_model=Context.vision_model,
            vision_system_prompt=vision_system_prompt
        )
    # rest of your conditions...
    return full_response

This modification ensures that Azure-based gpt- models specifically trigger generate_response_from_azure using Azure configurations. Make sure other parts of the code correctly handle your Azure credentials, which you seem to have configured already.

@haesleinhuepf
Owner

Hi @sebi06 ,

ok, the AI gives wrong advice here. I checked the example here, which should actually work for you too. I just tested this on my machine and it works (using github models though):

import os
from bia_bob import bob
bob.initialize(endpoint="https://models.inference.ai.azure.com", 
               api_key=os.environ.get('AZURE_API_KEY'),
               model="gpt-4o")

Would you mind testing this with your url and api key? If it does not work, could you please copy the entire error message here for debugging?

Thanks!

Best,
Robert

@sebi06
Author
sebi06 commented Mar 28, 2025

Hi @haesleinhuepf

this step seems to work fine (using our endpoint, which also ends with *.azure.com ) - I see no errors etc.

import os
from bia_bob import bob
bob.initialize(endpoint="https://xyz.azure.com", 
               api_key=os.environ.get('AZURE_API_KEY'),
               model="gpt-4o")

But when I try:

%%bob
What is the ZEN software?

I get the error:

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
Cell In[7], line 1
----> 1 get_ipython().run_cell_magic('bob', '', "What's the mission of ScaDS.AI's living lab?\n")

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\IPython\core\interactiveshell.py:2542, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
   2540 with self.builtin_trap:
   2541     args = (magic_arg_s, cell)
-> 2542     result = fn(*args, **kwargs)
   2544     # The code below prevents the output from being displayed
   2545 # when using magics with decorator @output_can_be_silenced
   2546 # when the last Python token in the expression is a ';'.
   2547 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File F:\Github\bia-bob\src\bia_bob\_machinery.py:171, in bob(line, cell)
    154 supported_file_types_for_generation = [".md", ".txt", ".csv", ".yml", ".yaml", ".json", ".py"]
    156 task_selection_prompt = f"""
    157 Given the following prompt, decide which of the following types of tasks we need to perform:
    158 {TASK_TYPE_CODE_GENERATION}. Code generation: The prompt asks for code to be generated.
   (...)    169 Now, write the number of the task type into the next cell. Print the number only.
    170 """
--> 171 response = ask_llm(task_selection_prompt)
    173 try:
    174     task_type = int(response.strip().strip("\n").split(".")[0])

File F:\Github\bia-bob\src\bia_bob\_utilities.py:13, in ask_llm(prompt, image, chat_history)
     10 if chat_history is None:
     11     chat_history = []
---> 13 return generate_response(chat_history=chat_history,
     14                   image=image,
     15                   model=Context.model,
     16                   system_prompt="",
     17                   user_prompt=prompt,
     18                   vision_system_prompt="")

File F:\Github\bia-bob\src\bia_bob\_utilities.py:99, in generate_response(chat_history, image, model, system_prompt, user_prompt, vision_system_prompt)
     93     full_response = generate_response_from_mistral(model, system_prompt, user_prompt, chat_history, image,
     94                                                   base_url=Context.endpoint, api_key=Context.api_key,
     95                                                   vision_model=Context.vision_model,
     96                                                   vision_system_prompt=vision_system_prompt)
     98 elif Context.endpoint is not None:
---> 99     full_response = generate_response_from_openai(model, system_prompt, user_prompt, chat_history, image,
    100                                                   base_url=Context.endpoint, api_key=Context.api_key,
    101                                                   vision_model=Context.vision_model,
    102                                                   vision_system_prompt=vision_system_prompt)
    103 elif "gpt-" in model:
    104     full_response = generate_response_from_openai(model, system_prompt, user_prompt, chat_history, image,
    105                                                   vision_model=Context.vision_model,
    106                                                   vision_system_prompt=vision_system_prompt)

File F:\Github\bia-bob\src\bia_bob\endpoints\_openai.py:60, in generate_response_from_openai(model, system_prompt, user_prompt, chat_history, image, base_url, api_key, vision_model, vision_system_prompt)
     57     for i, m in enumerate(messages):
     58         print(f"\n\nMESSAGE {i}: {m}")
---> 60 response = client.chat.completions.create(
     61     messages=messages,
     62     model=model,
     63     **kwargs
     64 )  # stream=True would be nice
     65 reply = response.choices[0].message.content
     67 # store question and answer in chat history

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_utils\_utils.py:279, in required_args.<locals>.inner.<locals>.wrapper(*args, **kwargs)
    277             msg = f"Missing required argument: {quote(missing[0])}"
    278     raise TypeError(msg)
--> 279 return func(*args, **kwargs)

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\resources\chat\completions\completions.py:914, in Completions.create(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, prediction, presence_penalty, reasoning_effort, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, web_search_options, extra_headers, extra_query, extra_body, timeout)
    871 @required_args(["messages", "model"], ["messages", "model", "stream"])
    872 def create(
    873     self,
   (...)    911     timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
    912 ) -> ChatCompletion | Stream[ChatCompletionChunk]:
    913     validate_response_format(response_format)
--> 914     return self._post(
    915         "/chat/completions",
    916         body=maybe_transform(
    917             {
    918                 "messages": messages,
    919                 "model": model,
    920                 "audio": audio,
    921                 "frequency_penalty": frequency_penalty,
    922                 "function_call": function_call,
    923                 "functions": functions,
    924                 "logit_bias": logit_bias,
    925                 "logprobs": logprobs,
    926                 "max_completion_tokens": max_completion_tokens,
    927                 "max_tokens": max_tokens,
    928                 "metadata": metadata,
    929                 "modalities": modalities,
    930                 "n": n,
    931                 "parallel_tool_calls": parallel_tool_calls,
    932                 "prediction": prediction,
    933                 "presence_penalty": presence_penalty,
    934                 "reasoning_effort": reasoning_effort,
    935                 "response_format": response_format,
    936                 "seed": seed,
    937                 "service_tier": service_tier,
    938                 "stop": stop,
    939                 "store": store,
    940                 "stream": stream,
    941                 "stream_options": stream_options,
    942                 "temperature": temperature,
    943                 "tool_choice": tool_choice,
    944                 "tools": tools,
    945                 "top_logprobs": top_logprobs,
    946                 "top_p": top_p,
    947                 "user": user,
    948                 "web_search_options": web_search_options,
    949             },
    950             completion_create_params.CompletionCreateParams,
    951         ),
    952         options=make_request_options(
    953             extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
    954         ),
    955         cast_to=ChatCompletion,
    956         stream=stream or False,
    957         stream_cls=Stream[ChatCompletionChunk],
    958     )

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:1242, in SyncAPIClient.post(self, path, cast_to, body, options, files, stream, stream_cls)
   1228 def post(
   1229     self,
   1230     path: str,
   (...)   1237     stream_cls: type[_StreamT] | None = None,
   1238 ) -> ResponseT | _StreamT:
   1239     opts = FinalRequestOptions.construct(
   1240         method="post", url=path, json_data=body, files=to_httpx_files(files), **options
   1241     )
-> 1242     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:919, in SyncAPIClient.request(self, cast_to, options, remaining_retries, stream, stream_cls)
    916 else:
    917     retries_taken = 0
--> 919 return self._request(
    920     cast_to=cast_to,
    921     options=options,
    922     stream=stream,
    923     stream_cls=stream_cls,
    924     retries_taken=retries_taken,
    925 )

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:1023, in SyncAPIClient._request(self, cast_to, options, retries_taken, stream, stream_cls)
   1020         err.response.read()
   1022     log.debug("Re-raising status error")
-> 1023     raise self._make_status_error_from_response(err.response) from None
   1025 return self._process_response(
   1026     cast_to=cast_to,
   1027     options=options,
   (...)   1031     retries_taken=retries_taken,
   1032 )

NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}

@haesleinhuepf
Owner
haesleinhuepf commented Mar 28, 2025

Can you please carefully check if os.environ["AZURE_OPENAI_ENDPOINT"] ends e.g. like this:

https://xyz.azure.com/api
https://xyz.azure.com/v1

And may I ask what the value of os.environ["AZURE_OPENAI_API_VERSION"] is?

Also from your code snippet on the very top:

# that code works fine
client = AzureOpenAI(
                api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
                api_key=os.environ["AZURE_OPENAI_API_KEY"],
            )

There are import statements missing. What library are you using for this?

Does this work too? (source)

import os
import openai

client = openai.OpenAI(
                       base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
                       api_key=os.environ["AZURE_OPENAI_API_KEY"])

print("\n".join([model.id for model in client.models.list().data]))

@sebi06
Author
sebi06 commented Mar 28, 2025

Hi @haesleinhuepf

I use from openai import AzureOpenAI, I just did not add it to the snippet. All I know is that the endpoint our "infrastructure guys" gave me ends with *.openai.azure.com. How do I check for *.openai.azure.com/api or *.openai.azure.com/v1? (I hope that is not a silly question)
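(Editor's note: one simple way to answer the "how do I check" question is to split the configured URL and look at its path component. A minimal sketch; the placeholder URL is only a fallback so the snippet runs even without the environment variable set.)

```python
import os
from urllib.parse import urlsplit

# Inspect the configured endpoint without printing any secrets.
# The fallback URL is a placeholder, not a real resource.
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT",
                          "https://xyz.openai.azure.com/")
parts = urlsplit(endpoint)
print("host:", parts.netloc)       # e.g. xyz.openai.azure.com
print("path:", parts.path or "/")  # anything after the host, e.g. /api or /v1
```

If the printed path is just "/", the endpoint has no /api or /v1 suffix appended.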

This always worked for me:

client1 = AzureOpenAI(
                api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
                api_key=os.environ["AZURE_OPENAI_API_KEY"],
            )

print("\n".join([model.id for model in client1.models.list().data]))

And your code

client2 = OpenAI(base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
                api_key=os.environ["AZURE_OPENAI_API_KEY"])

print("\n".join([model.id for model in client2.models.list().data]))

fails with:

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
Cell In[4], line 4
      1 client2 = OpenAI(base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
      2                 api_key=os.environ["AZURE_OPENAI_API_KEY"])
----> 4 print("\n".join([model.id for model in client2.models.list().data]))

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\resources\models.py:91, in Models.list(self, extra_headers, extra_query, extra_body, timeout)
     77 def list(
     78     self,
     79     *,
   (...)     85     timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
     86 ) -> SyncPage[Model]:
     87     """
     88     Lists the currently available models, and provides basic information about each
     89     one such as the owner and availability.
     90     """
---> 91     return self._get_api_list(
     92         "/models",
     93         page=SyncPage[Model],
     94         options=make_request_options(
     95             extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
     96         ),
     97         model=Model,
     98     )

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:1291, in SyncAPIClient.get_api_list(self, path, model, page, body, options, method)
   1280 def get_api_list(
   1281     self,
   1282     path: str,
   (...)   1288     method: str = "get",
   1289 ) -> SyncPageT:
   1290     opts = FinalRequestOptions.construct(method=method, url=path, json_data=body, **options)
-> 1291     return self._request_api_list(model, page, opts)

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:1142, in SyncAPIClient._request_api_list(self, model, page, options)
   1138     return resp
   1140 options.post_parser = _parser
-> 1142 return self.request(page, options, stream=False)

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:919, in SyncAPIClient.request(self, cast_to, options, remaining_retries, stream, stream_cls)
    916 else:
    917     retries_taken = 0
--> 919 return self._request(
    920     cast_to=cast_to,
    921     options=options,
    922     stream=stream,
    923     stream_cls=stream_cls,
    924     retries_taken=retries_taken,
    925 )

File f:\Documents\anaconda3\envs\prompt-env\Lib\site-packages\openai\_base_client.py:1023, in SyncAPIClient._request(self, cast_to, options, retries_taken, stream, stream_cls)
   1020         err.response.read()
   1022     log.debug("Re-raising status error")
-> 1023     raise self._make_status_error_from_response(err.response) from None
   1025 return self._process_response(
   1026     cast_to=cast_to,
   1027     options=options,
   (...)   1031     retries_taken=retries_taken,
   1032 )

NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}

@haesleinhuepf
Owner

And may I ask what the value of os.environ["AZURE_OPENAI_API_VERSION"] is?

@sebi06
Author
sebi06 commented Mar 28, 2025

Good point. I checked and it is: 2024-02-15-preview
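(Editor's note, an untested assumption rather than a confirmed fix: if bia-bob only accepts an OpenAI-style base_url, one workaround pattern is to point it at the deployment-scoped Azure URL and carry the api-version as a default query parameter. The sketch below only assembles those values with the stdlib; the deployment name is taken from the model listed above, and the key placeholder must be replaced.)

```python
# Build the configuration values one could hand to an OpenAI-style client
# (e.g. openai.OpenAI(**cfg)) to reach an Azure OpenAI deployment directly.
# All names here come from this thread; "<AZURE_OPENAI_API_KEY>" is a placeholder.

def azure_openai_compat_config(endpoint: str, deployment: str,
                               api_version: str) -> dict:
    """Assemble OpenAI-client kwargs that target one Azure deployment."""
    return {
        # Deployment-scoped base URL, so "/chat/completions" resolves correctly.
        "base_url": f"{endpoint.rstrip('/')}/openai/deployments/{deployment}",
        # Azure requires the api-version query parameter on every request.
        "default_query": {"api-version": api_version},
        # Azure authenticates with an "api-key" header, not a Bearer token.
        "default_headers": {"api-key": "<AZURE_OPENAI_API_KEY>"},
    }

cfg = azure_openai_compat_config("https://xyz.openai.azure.com/",
                                 "gpt-4o-2024-08-06", "2024-02-15-preview")
print(cfg["base_url"])
```

Whether bia-bob's initialize() forwards such extra client kwargs is an open question; supporting the AzureOpenAI client directly (as PR #238 may do) would be the cleaner fix.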

@sebi06
Author
sebi06 commented Apr 3, 2025

Any news or things I can try out?
