Microsoft launches preview of new AI abstraction library for .NET

On 29 October 2024, Microsoft unveiled the Microsoft.Extensions.VectorData.Abstractions library, enhancing .NET integration with the AI Semantic Kernel SDK and simplifying operations for developers.

On 29 October 2024, Microsoft released a preview of the Microsoft.Extensions.VectorData.Abstractions library for .NET. The library is designed to simplify the integration of .NET solutions with the Semantic Kernel SDK by providing abstractions over concrete AI implementations and models.

Microsoft’s Semantic Kernel is an enterprise-oriented SDK that lets developers plug in different Large Language Models (LLMs) and programming languages, and it orchestrates those plugins automatically. This release is one of a series of libraries developed jointly by the Semantic Kernel and .NET teams at Microsoft. Previously, Microsoft released the Microsoft.Extensions.AI library, which abstracts common AI services such as chat clients.

The latest addition, Microsoft.Extensions.VectorData.Abstractions, focuses on simplifying work with the vector stores used for LLM embeddings. In artificial intelligence, an embedding represents a data record as a point in a high-dimensional vector space, converting discrete data into a numeric format that neural networks can process. Records that are semantically similar end up closer together in that space, which enables semantic search rather than search based solely on text matching.
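To make "closer together" concrete, the following is a minimal sketch of cosine similarity, the distance function the article mentions for the Movie example. It uses no library code: the three-dimensional vectors are hypothetical stand-ins for real embeddings, and the helper name CosineSimilarity is an assumption made for illustration.

```csharp
using System;

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Values near 1 mean the vectors point the same way; values near 0 mean they are unrelated.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical 3-dimensional embeddings; real models emit hundreds of dimensions.
float[] familyFilm = { 0.9f, 0.1f, 0.0f };
float[] kidsCartoon = { 0.8f, 0.2f, 0.1f };
float[] horrorFilm = { 0.0f, 0.2f, 0.9f };

Console.WriteLine(CosineSimilarity(familyFilm, kidsCartoon)); // close to 1
Console.WriteLine(CosineSimilarity(familyFilm, horrorFilm));  // close to 0
```

A vector store performs essentially this comparison between the query vector and each stored vector, usually with indexing to avoid scanning every record.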

The Microsoft.Extensions.VectorData.Abstractions library supports create, read, update, and delete (CRUD) operations as well as search. Developers work with .NET Plain Old CLR Objects (POCOs) annotated with attributes such as VectorStoreRecordKey, VectorStoreRecordData, and VectorStoreRecordVector. An illustrative example is a Movie class in which each instance has a key, a title, a description, and a vector property that holds the record’s 384-dimensional embedding and is compared using the cosine-similarity distance function.
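A Movie record along those lines might be declared as in the sketch below. The attribute names, the 384 dimensions, and the cosine-similarity setting come from the description above; the property types (an int key and a ReadOnlyMemory<float> vector) are assumptions, and since the library is a preview the attribute signatures may change.

```csharp
using System;
using Microsoft.Extensions.VectorData;

// Sketch of the Movie record described in the article (preview API, subject to change).
public class Movie
{
    // Unique identifier of the record in the vector store (int is an assumed key type).
    [VectorStoreRecordKey]
    public int Key { get; set; }

    // Plain data fields stored alongside the vector.
    [VectorStoreRecordData]
    public string Title { get; set; } = string.Empty;

    [VectorStoreRecordData]
    public string Description { get; set; } = string.Empty;

    // The embedding: 384 dimensions, compared with cosine similarity.
    [VectorStoreRecordVector(384, DistanceFunction.CosineSimilarity)]
    public ReadOnlyMemory<float> Vector { get; set; }
}
```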

In practice, code built on the library works with the IEmbeddingGenerator interface for embedding and the IVectorStore interface for storage. Semantic Kernel provides an in-memory vector store, while the embedding generator can be backed by Ollama, a packaged LLM runtime running locally on the developer’s machine, with a small model such as all-minilm producing the embeddings. The generator is created as follows:

```csharp
IEmbeddingGenerator<string, Embedding<float>> generator =
    new OllamaEmbeddingGenerator(new Uri("http://localhost:11434/"), "all-minilm");
```
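The snippet above covers only the generator. The storage side, which the later snippets refer to as movies, could be set up roughly as follows. This is a sketch that assumes the preview Microsoft.SemanticKernel.Connectors.InMemory package and a Movie record type with an int key as described earlier; the collection name "movies" is an illustrative choice.

```csharp
using Microsoft.SemanticKernel.Connectors.InMemory;

// In-memory vector store supplied by Semantic Kernel (preview package).
var vectorStore = new InMemoryVectorStore();

// "movies" is an assumed collection name; int is the assumed key type of Movie.
var movies = vectorStore.GetCollection<int, Movie>("movies");

// Creates the collection if it does not already exist.
await movies.CreateCollectionIfNotExistsAsync();
```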

The embedding process involves generating a vector from a movie’s description using IEmbeddingGenerator.GenerateEmbeddingVectorAsync, after which the record is inserted into the vector store:

```csharp
movie.Vector = await generator.GenerateEmbeddingVectorAsync(movie.Description);
await movies.UpsertAsync(movie);
```

Querying works the same way: the query text is first converted into a vector with the same generator. For instance, the query “A family friendly movie” is embedded as:

```csharp
var queryEmbedding = await generator.GenerateEmbeddingVectorAsync(query);
```

The VectorizedSearchAsync method of the vector data store interface then finds the records most similar to the given query prompt:

```csharp
var searchOptions = new VectorSearchOptions()
{
    Top = 1,
    VectorPropertyName = "Vector"
};
var results = await movies.VectorizedSearchAsync(queryEmbedding, searchOptions);
```
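The matches can then be read back from results. The loop below is a sketch based on the preview API, in which the search result is assumed to expose an asynchronous Results sequence whose items carry the matched record and a similarity score; member names may differ in later releases.

```csharp
// Assumes `results` from the VectorizedSearchAsync call above (preview API).
await foreach (var result in results.Results)
{
    // Record is the matched Movie; Score is the similarity to the query vector.
    Console.WriteLine($"Title: {result.Record.Title}");
    Console.WriteLine($"Description: {result.Record.Description}");
    Console.WriteLine($"Score: {result.Score}");
}
```

With Top = 1, the loop runs at most once and prints only the closest match.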

Microsoft provides full code examples on its blog and on the Semantic Kernel learning site to guide developers through these steps. A significant application of the vector store abstraction library is extending LLMs with custom data stores using retrieval-augmented generation (RAG). This technique lets an LLM draw on a specific knowledge base at query time without retraining the model, and a full example of vector store RAG is available.
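The "augmented" step of RAG can be illustrated without any library: records retrieved from the vector store are placed into the prompt so that the model answers from them. The sketch below is not Microsoft's RAG example; BuildAugmentedPrompt is a hypothetical helper, and the retrieved text is made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: grounds the model's answer in retrieved records.
static string BuildAugmentedPrompt(string question, IReadOnlyList<string> retrieved)
{
    var builder = new StringBuilder();
    builder.AppendLine("Answer the question using only the context below.");
    builder.AppendLine("Context:");
    for (int i = 0; i < retrieved.Count; i++)
    {
        // Number each passage so the model can refer back to it.
        builder.AppendLine($"[{i + 1}] {retrieved[i]}");
    }
    builder.AppendLine($"Question: {question}");
    return builder.ToString();
}

// Made-up search hit; in practice this would come from a vector search.
var hits = new List<string>
{
    "Movie A: an animated adventure about talking toys.",
};

string augmentedPrompt = BuildAugmentedPrompt("Which movie is suitable for children?", hits);
Console.WriteLine(augmentedPrompt); // This text would be sent to the chat model.
```

Because the knowledge lives in the store rather than in the model weights, updating the answers only requires upserting new records.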

The library is currently in preview and is expected to remain so until the release of .NET 9. Developers can give feedback to Microsoft’s development team by filing issues in the GitHub repository.

Looking ahead, Microsoft aims to:

Enhance collaboration with the Semantic Kernel to introduce more streamlined experiences in RAG scenarios.
Partner with vector store collaborators to integrate Microsoft.Extensions.VectorData into a broader .NET ecosystem.

Source: Noah Wire Services

More on this & sources

https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/generic-data-model – This link explains the use of the generic data model in the Microsoft.Extensions.VectorData.Abstractions library, including how to define and use vector stores with the Semantic Kernel.
https://www.infoq.com/news/2024/10/dotnet-ai-integration-libraries/ – This article discusses the preview release of Microsoft.Extensions.AI and Microsoft.Extensions.AI.Abstractions, which are part of the broader effort to integrate AI services into .NET applications, including the collaboration with Semantic Kernel.
https://devblogs.microsoft.com/dotnet/introducing-microsoft-extensions-ai-preview/ – This blog post introduces the Microsoft.Extensions.AI and Microsoft.Extensions.AI.Abstractions libraries, highlighting their role in providing unified abstractions for AI services, which is relevant to the integration with Semantic Kernel.
https://devblogs.microsoft.com/dotnet/introducing-microsoft-extensions-vector-data/ – This article introduces the Microsoft.Extensions.VectorData.Abstractions library, focusing on its role in simplifying the integration of vector stores for LLM embeddings and its collaboration with the Semantic Kernel.