Semantic Kernel:图识文

多模态是每个LLM具有的能力,图片又是最常见的信息载体,GPT对图片的识别也很早就有了,随着GPT版本的迭代,效果越来越好。SK也是在很多就适配了图识文,只不过最近版本才支持本地图片的上传。(有点晚)
图片场景识别:

using Microsoft.SemanticKernel.ChatCompletion;using Microsoft.SemanticKernel;using Microsoft.SemanticKernel.Connectors.OpenAI;
var chatModelId = "gpt-4o";var key = File.ReadAllText(@"C:GPTkey.txt");#pragma warning disable SKEXP0070#pragma warning disable SKEXP0010#pragma warning disable SKEXP0001#pragma warning disable SKEXP0110var kernel = Kernel.CreateBuilder() .AddOpenAIChatCompletion(chatModelId, key) .Build();
var chat = kernel.GetRequiredService<IChatCompletionService>();var chatHistory = new ChatHistory();chatHistory.AddUserMessage(new ChatMessageContentItemCollection{ new TextContent("请说明这是那里,什么样的天气,大家在干什么?一共有多少人"), new ImageContent(File.ReadAllBytes("tam.jpg"),"image/jpeg")});var settings = new Dictionary<string, object>{ ["max_tokens"] = 1000, ["temperature"] = 0.2, ["top_p"] = 0.8, ["presence_penalty"] = 0.0, ["frequency_penalty"] = 0.0};
var content = chat.GetStreamingChatMessageContentsAsync(chatHistory, new PromptExecutionSettings{ ExtensionData = settings});await foreach (var item in content){ Console.Write(item.Content);}Console.ReadLine();

图片:

图片

结果:

图片

文字识别:

using Microsoft.SemanticKernel.ChatCompletion;using Microsoft.SemanticKernel;using Microsoft.SemanticKernel.Connectors.OpenAI;
var chatModelId = "gpt-4o";var key = File.ReadAllText(@"C:GPTkey.txt");#pragma warning disable SKEXP0070#pragma warning disable SKEXP0010#pragma warning disable SKEXP0001#pragma warning disable SKEXP0110var kernel = Kernel.CreateBuilder() .AddOpenAIChatCompletion(chatModelId, key) .Build();
var chat = kernel.GetRequiredService<IChatCompletionService>();var chatHistory = new ChatHistory();chatHistory.AddUserMessage(new ChatMessageContentItemCollection{ new TextContent("请识别图片上的文字,并输出"), new ImageContent(File.ReadAllBytes("japancard.png"),"image/jpeg")});var settings = new Dictionary<string, object>{ ["max_tokens"] = 1000, ["temperature"] = 0.2, ["top_p"] = 0.8, ["presence_penalty"] = 0.0, ["frequency_penalty"] = 0.0};
var content = chat.GetStreamingChatMessageContentsAsync(chatHistory, new PromptExecutionSettings{ ExtensionData = settings});await foreach (var item in content){ Console.Write(item.Content);}Console.ReadLine();

图片:

图片

结果:

图片

声明:文中观点不代表本站立场。本文传送门:https://eyangzhen.com/419719.html

联系我们
联系我们
分享本页
返回顶部