In this blog post we continue our AI journey with SQL Server 2025 (preview). If you haven’t read the first post yet, you’ll find it here.
In this article, we will explore data embeddings, a key concept for modern AI-driven applications and platforms. We will then examine the latest capabilities introduced in SQL Server 2025 (preview) and Azure SQL, focusing on their new support for vector data types. Finally, we will generate embeddings using an AI model and store them directly in our database, demonstrating how to integrate AI-powered semantic abstraction into your SQL workloads.
But First: What Exactly Are Data Embeddings?
Let’s start with the basics. From a linguistic perspective, written text consists of syntax and semantics. Syntax refers to the structure of the text and the arrangement of words and letters, while semantics refers to the actual meaning conveyed by the text. Two texts can have similar syntax but different semantics, and they can also have the same or similar semantics but different syntax. For example:
- Sentence 1: Our customer satisfaction has increased.
- Sentence 2: The contentment of our clients is better than before.
While both sentences have basically the same meaning, they differ in structure and the way the words and letters are arranged.
In IT, we have several simple ways to search for syntactic matches: exact string matching, wildcard searches, or regular expressions (regex) for more sophisticated patterns.
But when we are looking for semantic matches, simple programming functions that only evaluate and compare the arrangement of letters are not enough. In such cases, we need a human, or another form of intelligence, that can interpret those arranged letters and understand their meaning.
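To make this limitation concrete, here is a small sketch using the two sample sentences from above. The variable names and the search pattern are illustrative assumptions. It only shows that a wildcard search with LIKE finds the first sentence and misses the second, even though both mean the same thing.

```sql
-- Illustrative only: variable names and the search pattern are assumptions.
DECLARE @sentence1 NVARCHAR(200) = N'Our customer satisfaction has increased.';
DECLARE @sentence2 NVARCHAR(200) = N'The contentment of our clients is better than before.';

SELECT
    -- The pattern appears literally in sentence 1, so this returns 1.
    CASE WHEN @sentence1 LIKE N'%customer satisfaction%' THEN 1 ELSE 0 END AS sentence1_matches,
    -- Sentence 2 has the same meaning but different wording, so this returns 0.
    CASE WHEN @sentence2 LIKE N'%customer satisfaction%' THEN 1 ELSE 0 END AS sentence2_matches;
```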
That’s exactly where AI comes into play. An embedding model processes the written text and generates a numerical representation of its semantics: a vector of numbers, which we call an “embedding”.
When I first heard about embeddings, I asked myself: Okay, but what is the data embedded in?
The answer is that the data is embedded in a high-dimensional mathematical space, with the vector data representing its coordinates within that space. Since the semantics of embedded data are represented as coordinates in a mathematical space, we can calculate the distance between embeddings and use that to search for data with similar meaning.
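The idea of distance between coordinates can be tried out with hand-written toy vectors before any AI model is involved. The following sketch assumes a SQL Server 2025 (preview) or Azure SQL database and uses the VECTOR_DISTANCE function that is covered later in this post. The three-dimensional vectors are made up for illustration; real embeddings have far more dimensions.

```sql
-- Toy vectors, invented for illustration only.
DECLARE @a VECTOR(3) = '[1.0, 0.0, 0.0]';
DECLARE @b VECTOR(3) = '[0.9, 0.1, 0.0]';  -- points in almost the same direction as @a
DECLARE @c VECTOR(3) = '[0.0, 0.0, 1.0]';  -- orthogonal to @a

SELECT
    VECTOR_DISTANCE('cosine', @a, @b) AS distance_a_b,  -- small value: similar direction
    VECTOR_DISTANCE('cosine', @a, @c) AS distance_a_c;  -- clearly larger value: unrelated direction
```

The smaller the distance, the closer the two vectors are in that space, and for embeddings, the closer the meaning of the underlying texts.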
Enough Theory for Now – Let’s Get Practical!
Let’s take a look at how we can generate data embeddings directly within our database through T-SQL, using the latest functions and features of SQL Server 2025 and Azure SQL.
For that I’m using the same sample database that we used in the previous blog post. In the meantime I added a new column to my “dbo.products” table and filled it with an AI-generated product description for each product. If you are wondering how I did that, you can read about it here.

SELECT p.product_id, p.product_name, p.Description, p.currency, p.price,
       c.category_name, b.brand_name, co.color_name, g.gender, s.season
FROM dbo.products p
INNER JOIN dbo.category c ON p.category_id = c.category_id
INNER JOIN dbo.brand b ON p.brand_id = b.brand_id
INNER JOIN dbo.color co ON p.color_id = co.color_id
INNER JOIN dbo.gender g ON p.gender_id = g.gender_id
INNER JOIN dbo.season s ON p.season_id = s.season_id;

Deploy an Embedding Model:
To generate embeddings for our product data, we first need an AI model that is capable of producing them. I therefore go to my Azure OpenAI resource and search the model catalog in the Azure AI Foundry portal for embedding models (the previous blog post shows exactly how to get there and deploy resources). There are currently three embedding models available from OpenAI. For this blog post I’m choosing “text-embedding-3-small” and deploying it by clicking “text-embedding-3-small” -> “use this model” -> “Deploy”.



The model then appears in the Azure AI Foundry portal under the section “Deployments”.

Clicking on the model shows the model’s API endpoint and the access key.

Create a Database Scoped Credential:
So far so good. Let’s connect to the SQL Server 2025 instance that hosts the ShopAI database. To register the external model in our database and access it through the appropriate T-SQL functions, I first create a database scoped credential. You can create the credential with a managed identity or with an API key. I’m using the API key which I copied from the Azure AI Foundry portal. Note that the credential name must be a URL path that is more generic than the request URL, for example just the host part of the endpoint:
-- Create database scoped credential:
CREATE DATABASE SCOPED CREDENTIAL [https://shopai-blog.openai.azure.com/]
WITH IDENTITY = 'HTTPEndpointHeaders', secret = '{"api-key":"YOUR-API-KEY"}';
GO
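Two prerequisites may be needed before the statement above succeeds and the model can be called. The following is a sketch under assumptions, not part of the original walkthrough: it assumes you have permission to change server configuration, that the database has no master key yet, and the password is a placeholder. The final query simply lists the credentials so you can confirm the new one exists.

```sql
-- Assumption: outbound REST calls are not yet enabled on this SQL Server 2025 instance.
EXECUTE sp_configure 'external rest endpoint enabled', 1;
RECONFIGURE WITH OVERRIDE;
GO

-- A database master key protects the secret stored in a database scoped credential.
-- The password is a placeholder; replace it with your own strong password.
IF NOT EXISTS (SELECT 1 FROM sys.symmetric_keys WHERE name = N'##MS_DatabaseMasterKey##')
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = N'<YOUR-STRONG-PASSWORD>';
GO

-- After creating the credential: confirm that it is registered in the database.
SELECT name, credential_identity
FROM sys.database_scoped_credentials;
```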


Register the External AI Model:
Once the database scoped credential is created, we can register the external model in our database with the “CREATE EXTERNAL MODEL” command, which is currently available in SQL Server 2025 (preview). You can find more information about this brand-new command in the Microsoft documentation: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-model-transact-sql?view=sql-server-ver17
-- Create EXTERNAL MODEL
CREATE EXTERNAL MODEL OpenAITextEmbedding3Small
AUTHORIZATION dbo
WITH (
    LOCATION = 'YOUR-ENDPOINT_URL',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'text-embedding-3-small',
    CREDENTIAL = [YOUR-CREDENTIAL],
    PARAMETERS = '{"Dimensions":1536}'
);
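To check that the registration worked, the model metadata can be queried. This is a minimal sketch that assumes the sys.external_models catalog view described in the SQL Server 2025 (preview) documentation is available in your build. It selects all columns because the exact column set may still change during the preview.

```sql
-- Assumption: the sys.external_models catalog view exists in this preview build.
-- Lists every external model registered in the current database.
SELECT *
FROM sys.external_models;
```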


Test the Embedding Function:
Once the external model has been created successfully, we can generate embeddings with the “AI_GENERATE_EMBEDDINGS” function. This function is currently available in SQL Server 2025 (preview).
The function takes the text to be embedded and the name of a previously created external model as arguments; optional model-specific parameters can be appended in JSON format. You can find more information about this brand-new function in the Microsoft documentation: https://learn.microsoft.com/en-us/sql/t-sql/functions/ai-generate-embeddings-transact-sql?view=sql-server-ver17
Let’s test the registered model and the embedding function. The result is a JSON array containing a long list of numerical values.
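A minimal test call could look like the following sketch. It reuses the model name registered above and the same call syntax as the statements later in this post; the sample text is an arbitrary assumption.

```sql
-- The sample text is arbitrary; any short string works for a smoke test.
SELECT AI_GENERATE_EMBEDDINGS(
           N'Black summer pants for men'
           USE MODEL OpenAITextEmbedding3Small
       ) AS embedding;
```

If the call fails, the endpoint URL, the credential, and the server's outbound connectivity are the first things to check.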

Generate and Store Product Embeddings:
Having deployed the embedding model in Azure, registered it in our database, and successfully tested the embedding function, our next step is to generate embeddings for our products.
For that, I add an additional column called “embedding” to our “dbo.products” table. For the column I’m using the vector data type, which is currently available in Azure SQL and in SQL Server 2025 (preview). Note that the number in the parentheses of the data type definition is the number of dimensions used for the embedding. How many dimensions you need depends on the model you are using. The “text-embedding-3-small” model from OpenAI uses 1536 dimensions by default:
ALTER TABLE dbo.products
ADD embedding vector(1536);


I now want to generate embeddings for all products in my ShopAI database and store them in the newly created column, taking the following product attributes into account when generating the embeddings:
- Product Name
- Product Description
- Product Price
- Product Season
- Product Gender
To accomplish this, I use the code below to generate the embeddings:
UPDATE p
SET p.embedding = AI_GENERATE_EMBEDDINGS(
        CONCAT(
            N'Product Name: ', COALESCE(p.product_name, N'Unknown'),
            N' | Product Description: ', COALESCE(p.[description], N'Unknown'),
            N' | Product Price (CHF): ', COALESCE(CONVERT(NVARCHAR(32), p.price), N'Unknown'),
            N' | Product Season: ',
            CASE p.season_id
                WHEN 1 THEN N'Spring'
                WHEN 2 THEN N'Summer'
                WHEN 3 THEN N'Autumn'
                WHEN 4 THEN N'Winter'
                WHEN 5 THEN N'All Seasons'
                ELSE N'Unknown'
            END,
            N' | Product Gender: ',
            CASE p.gender_id
                WHEN 1 THEN N'Male'
                WHEN 2 THEN N'Female'
                WHEN 3 THEN N'Unisex'
                ELSE N'Unknown'
            END
        ) USE MODEL OpenAITextEmbedding3Small
    )
FROM dbo.products AS p
WHERE p.embedding IS NULL;
The query ran for over four minutes and successfully generated embeddings for all products.
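A quick way to confirm that no product was left without an embedding is to count the rows. The sketch below relies only on the table and column introduced above; the column aliases are my own.

```sql
-- Counts products with and without a stored embedding.
SELECT
    COUNT(*) AS total_products,
    SUM(CASE WHEN embedding IS NULL THEN 0 ELSE 1 END) AS products_with_embedding,
    SUM(CASE WHEN embedding IS NULL THEN 1 ELSE 0 END) AS products_without_embedding
FROM dbo.products;
```

Because the UPDATE statement filters on “embedding IS NULL”, it can simply be run again if some rows are still missing an embedding.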



Test Semantic Similarity Functions:
So far so good: we deployed an embedding model in Azure, created a credential for accessing the model, registered the embedding model with the new T-SQL command, and generated embeddings for our products.
Now let’s see how we can use the embedding model and the embedded data for semantic search.
For that I’m using the “VECTOR_DISTANCE” function, which is available in SQL Server 2025 (preview) and in Azure SQL. The function takes three arguments: the first is the metric used to calculate the distance between two embeddings, and the other two are the vectors you want to compare. You can find more information about this brand-new function in the Microsoft documentation: https://learn.microsoft.com/en-us/sql/t-sql/functions/vector-distance-transact-sql?view=sql-server-ver17
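To get a feeling for the metric argument, the same pair of vectors can be compared with each of the documented metrics ('cosine', 'euclidean' and 'dot'). The vectors in this sketch are invented for illustration and are not embeddings; the second vector points in the same direction as the first but is twice as long.

```sql
-- Toy vectors, invented for illustration only.
DECLARE @v1 VECTOR(3) = '[1.0, 2.0, 3.0]';
DECLARE @v2 VECTOR(3) = '[2.0, 4.0, 6.0]';  -- same direction as @v1, twice the length

SELECT
    VECTOR_DISTANCE('cosine',    @v1, @v2) AS cosine_distance,      -- compares direction only, ignores length
    VECTOR_DISTANCE('euclidean', @v1, @v2) AS euclidean_distance,   -- straight-line distance between the two points
    VECTOR_DISTANCE('dot',       @v1, @v2) AS negative_dot_product; -- documented as the negative dot product
```

Cosine distance is a common choice for text embeddings because it depends on the direction of the vectors and not on their length, which is why it is used in the search query in this post.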
Okay, let’s see how we can use this function – It’s summer and I would like to go on summer vacation (again). Therefore I’m looking for some pants which I can wear during my summer holidays. I don’t like anything flashy, so I’d like the pants in black.
Let’s test with the code below if we can find a matching product for me:
DECLARE @SemanticSearchText NVARCHAR(MAX) = N'Im looking for black pants for men which I can wear during my summer vacation.';
DECLARE @qv VECTOR(1536) = AI_GENERATE_EMBEDDINGS(@SemanticSearchText USE MODEL OpenAITextEmbedding3Small);
SELECT TOP (10)
    p.product_name,
    p.Description,
    p.price,
    s.season,
    g.gender,
    VECTOR_DISTANCE('cosine', @qv, p.embedding) AS distance
FROM dbo.products p
INNER JOIN dbo.season s ON p.season_id = s.season_id
INNER JOIN dbo.gender g ON p.gender_id = g.gender_id
ORDER BY distance;
As you can see, the results are quite accurate. All of the products are pants, except for the one with the highest distance, which is a full swimsuit. Every product is black in color, and most are categorized for the summer season, except for two that are suited for all seasons:
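If the semantic ranking should also respect hard criteria, the distance calculation can be combined with ordinary predicates. The following is only a sketch. It assumes the ID-to-label mapping used in the UPDATE statement earlier (season 2 = Summer, 5 = All Seasons; gender 1 = Male, 3 = Unisex), which may differ in your data, and the search text is my own shortened variant.

```sql
-- Assumption: the season_id and gender_id values follow the mapping from the UPDATE statement.
DECLARE @SemanticSearchText NVARCHAR(MAX) = N'Black pants for men for the summer vacation.';
DECLARE @qv VECTOR(1536) = AI_GENERATE_EMBEDDINGS(@SemanticSearchText USE MODEL OpenAITextEmbedding3Small);

SELECT TOP (10)
    p.product_name,
    p.price,
    VECTOR_DISTANCE('cosine', @qv, p.embedding) AS distance
FROM dbo.products AS p
WHERE p.embedding IS NOT NULL      -- skip products without an embedding
  AND p.season_id IN (2, 5)        -- Summer or All Seasons
  AND p.gender_id IN (1, 3)        -- Male or Unisex
ORDER BY distance;
```

The filters narrow the candidate set with exact rules, while the distance still decides the order within that set.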

By combining SQL Server 2025’s new vector capabilities with external embedding models, we can seamlessly integrate semantic understanding directly into our data workflows. This enables intelligent, context-aware queries that go far beyond simple keyword matching.
You can find all the code on GitHub and test it on your own. Feel free to share your thoughts with me in the comment section and stay tuned for upcoming posts from my colleagues and me. 😉